Abstract: Classifying data hierarchically is an efficient approach
to analyzing data. Data is usually classified into multiple categories or
annotated with a set of labels. To analyze multi-labeled data, the target
data must be specified by giving a set of labels as a semantic range.
Data is analyzed for several distinct purposes. This paper shows
which multi-labeled data should be the target of analysis for each of
those purposes, and discusses the role of a label relative to a set of
labels by investigating what changes when the label is added to the set.
These discussions yield methods for the advanced analysis
of multi-labeled data, which are based on the role of a label relative to
a semantic range.
Abstract: To accelerate similarity search in high-dimensional databases, we propose a new hierarchical indexing method. It is composed of an offline phase and an online phase, and our contribution concerns both. In the offline phase, after the data are grouped into clusters and a hierarchical index is constructed, the main originality of our contribution is a method for constructing bounding forms of clusters that avoids overlapping. In the online phase, this idea considerably improves the performance of similarity search; for this second phase, we have also developed an adapted search algorithm. Our method, named NOHIS (Non-Overlapping Hierarchical Index Structure), uses Principal Direction Divisive Partitioning (PDDP) as its clustering algorithm. The principle of PDDP is to divide data recursively into two sub-clusters; the division is done using the hyperplane that is orthogonal to the principal direction derived from the covariance matrix and that passes through the centroid of the cluster to be divided. The data of each of the two resulting sub-clusters are enclosed by a minimum bounding rectangle (MBR). The two MBRs are oriented according to the principal direction. Consequently, the two forms are guaranteed not to overlap. Experiments use databases containing image descriptors. Results show that the proposed method outperforms sequential scan and the SR-tree in processing k-nearest neighbor queries.
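The single PDDP division step described in this abstract can be sketched as follows. This is a minimal illustration, not the NOHIS implementation: the function name `pddp_split`, the use of power iteration to approximate the principal direction, and the sample points are all assumptions made for the example.

```python
import math

def pddp_split(points, iters=200):
    """One PDDP step: split points by the hyperplane through the centroid
    that is orthogonal to the principal direction of the cluster."""
    n, d = len(points), len(points[0])
    centroid = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - centroid[j] for j in range(d)] for p in points]
    # Covariance matrix of the cluster (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    # Power iteration approximates the leading eigenvector of the covariance
    # matrix (assumption: the start vector is not orthogonal to it).
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # The sign of the projection onto the principal direction decides the side.
    left, right = [], []
    for p, c in zip(points, centered):
        proj = sum(c[j] * v[j] for j in range(d))
        (left if proj <= 0 else right).append(p)
    return centroid, v, left, right

# Hypothetical 2-D data: two groups separated along the x axis.
sample = [(0.0, 0.0), (1.0, 0.1), (2.0, -0.1),
          (10.0, 0.0), (11.0, 0.1), (12.0, -0.1)]
centroid, direction, left, right = pddp_split(sample)
print(left, right)
```

Because every point of one sub-cluster has a non-positive projection and every point of the other a positive one, rectangles aligned with the principal direction cannot overlap along that axis, which is the property the abstract relies on.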
Abstract: In this paper, the Land Marks for Unique Addressing (LMUA) algorithm is developed to generate a unique ID for each and every node, which leads to the formation of overlapping or non-overlapping clusters based on the unique IDs. To overcome the drawback of the LMUA algorithm, the concept of clustering is introduced. Based on the clustering concept, a Land Marks for Unique Addressing and Clustering (LMUAC) algorithm is developed to construct strictly non-overlapping clusters, to classify the nodes into cluster heads, member nodes and gateway nodes, and to generate the hierarchical code for the cluster heads to operate in the level-one hierarchy for wireless communication switching. It is also shown whether the existing network can be expanded without modifying the cost of adding a cluster head. The developed algorithm shows one way of efficiently constructing the
Abstract: This study proposes an emotional speech recognition system for
smart phone applications that can be combined
with 3G mobile communications and social networks to provide users
and their groups with more interaction and care. The study developed
a mechanism that uses support vector machines (SVM) to recognize
emotions in speech such as happiness, anger, sadness and normal.
The mechanism uses a hierarchical classifier to adjust the weights of
acoustic features and divides various parameters into the categories of
energy and frequency for training. In this study, 28 commonly used
acoustic features, including pitch and volume, were used for
training. In addition, a time-frequency parameter obtained by
continuous wavelet transforms was also used to identify the accent and
intonation in a sentence during the recognition process. The Berlin
Database of Emotional Speech was used by dividing the speech into
male and female data sets for training. According to the experimental
results, the accuracies of male and female test sets were increased by
4.6% and 5.2% respectively after using the time-frequency parameter
for classifying happy and angry emotions. For the classification of all
emotions, the average accuracy, including male and female data, was
63.5% for the test set and 90.9% for the whole data set.
Abstract: Understanding the cell's large-scale organization is an interesting task in computational biology, and protein-protein interactions can reveal important aspects of the organization and function of the cell. Here, we investigated the correspondence between protein interactions and function in yeast. We obtained the correlations among the set of proteins and then clustered these correlations using both hierarchical and biclustering methods. Detailed analyses of the proteins in each cluster were carried out using their functional annotations. As a result, we found that some functional classes appear together in almost all biclusters. In hierarchical clustering, on the other hand, the dominance of one functional class is observed. In light of the clustering data, we have verified some interactions that were not identified as core interactions in DIP, and we have characterized some functionally unknown proteins according to the interaction data and functional correlation. In brief, going from interaction data to function, we observed correlated results on the relationship between interaction and function, which might give clues about the organization of the proteins and help to predict new interactions and to characterize the functions of unknown proteins.
Abstract: In this paper we present a novel approach for wavelet compression of electrocardiogram (ECG) signals based on the set partitioning in hierarchical trees (SPIHT) coding algorithm. The SPIHT algorithm has achieved prominent success in image compression. Here we use a modified version of SPIHT for one-dimensional signals. We applied the wavelet transform with the SPIHT coding algorithm to different records of the MIT-BIH database. The results show the high efficiency of this method in ECG compression.
Abstract: An IEC technique is described for a multi-objective
search of conceptual solutions. The survivability of solutions is
influenced by both model-based fitness and subjective human
preferences. The concepts' preferences are articulated via a hierarchy
of sub-concepts. The suggested method produces an objective-subjective
front. An academic example is employed to demonstrate the
proposed approach.
Abstract: The aim of this work was to detect genetic variability among a set of 40 castor genotypes using 8 RAPD markers. Amplification of the genomic DNA of the 40 genotypes by RAPD analysis yielded 66 fragments, with an average of 8.25 polymorphic fragments per primer. The number of amplified fragments ranged from 3 to 13, with amplicon sizes ranging from 100 to 1200 bp. Polymorphic information content (PIC) values ranged from 0.556 to 0.895 with an average of 0.784, and diversity index (DI) values ranged from 0.621 to 0.896 with an average of 0.798. A dendrogram based on hierarchical cluster analysis using the UPGMA algorithm was prepared; the analyzed genotypes were grouped into two main clusters, and only two genotypes could not be distinguished. Knowledge of the genetic diversity of castor can be used in future breeding programs for increased oil production for industrial uses.
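The abstract does not state which formulas were used for PIC and DI. As a hedged illustration only, the sketch below assumes two commonly used definitions: DI = 1 − Σ pᵢ² and the Botstein-style PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ², where pᵢ are the band (allele) frequencies of one marker. The function names and the example frequencies are hypothetical.

```python
def diversity_index(freqs):
    """DI = 1 - sum(p_i^2) (assumed definition, not confirmed by the abstract)."""
    return 1.0 - sum(p * p for p in freqs)

def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 (assumed definition)."""
    squares = [p * p for p in freqs]
    cross = sum(2.0 * squares[i] * squares[j]
                for i in range(len(squares))
                for j in range(i + 1, len(squares)))
    return 1.0 - sum(squares) - cross

# Hypothetical frequencies for one marker; they must sum to 1.
example = [0.4, 0.3, 0.2, 0.1]
print(diversity_index(example), pic(example))
```

Under these definitions PIC is never larger than DI for the same marker, which is consistent with the averages reported above (0.784 versus 0.798).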
Abstract: This paper suggests an algorithm for the evaluation
and selection of suppliers. First, all the materials and services needed by the organization were identified and categorized
according to their nature using the ABC method. Next, purchase strategies were determined in order to reduce risk factors and maximize the organization's profit. Then, appropriate criteria were identified for the primary evaluation of suppliers applying to the organization. The output of this stage was a list of suppliers qualified by the organization to participate in its tenders. Subsequently, for a particular material, appropriate criteria for ordering that
material were determined, taking into account the material's specifications as well as the organization's needs. Finally, to validate and verify the
proposed model, it was applied to Mobarakeh Steel Company (MSC), whose qualified suppliers were ranked by means of a Hierarchical Fuzzy TOPSIS method. The obtained results
show that the proposed algorithm is quite effective, efficient and easy to apply.
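For orientation, the ranking idea behind TOPSIS can be sketched in its classical crisp form. This is not the Hierarchical Fuzzy TOPSIS variant used in the paper: it uses plain numbers instead of fuzzy numbers and a single level of criteria, and the supplier scores, weights and criteria below are invented for the example.

```python
import math

def topsis(matrix, weights, benefit):
    """Classical (crisp) TOPSIS closeness scores, one per alternative.

    matrix[i][j] is the score of alternative i on criterion j;
    benefit[j] is True if larger values of criterion j are better.
    """
    n = len(matrix[0])
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    columns = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical suppliers scored on (quality, price); price is a cost criterion.
suppliers = [[9.0, 100.0], [5.0, 200.0], [7.0, 150.0]]
scores = topsis(suppliers, weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

A score close to 1 means the alternative lies near the ideal solution and far from the anti-ideal one; alternatives are ranked by descending score.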
Abstract: Large-scale systems such as computational Grids are
distributed computing infrastructures that can provide globally
available network resources. The evolution of information processing
systems in Data Grids is characterized by a strong decentralization of
data in several fields, whose objective is to ensure the availability and
the reliability of the data in order to provide fault tolerance
and scalability, which is possible only with the use of
replication techniques. Unfortunately, the use of these techniques
has a high cost, because it is necessary to maintain consistency
between the distributed data. Nevertheless, agreeing to live with
certain imperfections can improve the performance of the system by
improving concurrency. In this paper, we propose a multi-layer protocol
that combines the pessimistic and optimistic approaches and is designed
for data consistency maintenance in large-scale systems. Our
approach is based on a hierarchical representation model with tree
layers, which serves a dual purpose: first, it
makes it possible to reduce response times compared to a completely
pessimistic approach, and second, it improves the quality
of service compared to an optimistic approach.
Abstract: In this paper, an improvement of PDLZW implementation
with a new dictionary updating technique is proposed. A
unique dictionary is partitioned into hierarchical variable word-width
dictionaries. This allows us to search through dictionaries in parallel.
Moreover, the barrel shifter is adopted for loading a new input string
into the shift register in order to achieve a faster speed. However,
the original PDLZW uses a simple FIFO update strategy, which is
not efficient. Therefore, a new window-based updating technique
is implemented to better distinguish how often each
particular address in the window is referenced. A freezing policy
is applied to the most frequently referenced address, which is not
updated until all the other addresses in the window have the same
priority. This guarantees that the more frequently referenced addresses are
not updated until their time comes. This updating policy
improves the compression efficiency of the proposed
algorithm while still keeping the architecture low in complexity and easy
to implement.
Abstract: The paper contains a review of the literature in terms of the critical analysis of methodologies of university ranking systems. Furthermore, the initiatives supported by the European Commission (U-Map, U-Multirank) and the CHE Ranking are described. Special attention is paid to the tendencies in the development of ranking systems. According to the author, the ranking organizations should abandon the classic form of ranking, namely a hierarchical ordering of universities from “the best” to “the worst”. In the empirical part of this paper, using a cluster analysis method called k-means clustering, the author presents classifications of the top universities from the Shanghai Jiao Tong University's (SJTU) Academic Ranking of World Universities (ARWU).
Abstract: This paper proposes a way to track persons by making use of multiple non-overlapping cameras. Tracking persons on multiple non-overlapping cameras enables data communication among cameras through the network connection between a camera and a computer, while at the same time transferring human feature data captured by a camera to another camera that is connected via the network. To track persons with a camera and send the tracking data to another camera, the proposed system uses a hierarchical human model that comprises a head, a torso, and legs. The feature data of the person being modeled are transferred to the server, after which the server sends the feature data of the human model to the cameras connected over the network. This enables a camera that captures a person's movement entering its vision to keep tracking the recognized person with the use of the feature data transferred from the server.
Abstract: Collected data must be organized to be utilized efficiently, and hierarchical classification of data is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels, and shows that there are four types of orders, which are characterized by whether the labels of the expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are also discussed: that data is also expressed by the higher set of labels, and that different sets of labels express different data.
Abstract: This research presents a system for post-processing of
data that takes mined flat rules as input and discovers crisp as well as
fuzzy hierarchical structures using a Learning Classifier System
approach. A Learning Classifier System (LCS) is a machine
learning technique that combines evolutionary computing,
reinforcement learning, supervised or unsupervised learning and
heuristics to produce adaptive systems. An LCS learns by interacting
with an environment from which it receives feedback in the form of
numerical reward. Learning is achieved by trying to maximize the
amount of reward received. A crisp description of a concept usually
cannot represent human knowledge completely and practically. In the
proposed Learning Classifier System, the initial population is constructed
as a random collection of HPR–trees (related production rules) and
crisp / fuzzy hierarchies are evolved. A fuzzy subsumption relation is
suggested for the proposed system and based on Subsumption Matrix
(SM), a suitable fitness function is proposed. Suitable genetic
operators are proposed for the chosen chromosome representation
method. For implementing reinforcement a suitable reward and
punishment scheme is also proposed. Experimental results are
presented to demonstrate the performance of the proposed system.
Abstract: In this paper, we represent each protein structure as a
graph, so that a protein structure database becomes a graph database.
Each graph is represented by a spectral vector. We use the Jacobi
rotation algorithm to calculate the eigenvalues of the normalized
Laplacian representation of the graph's adjacency matrix. To measure
the similarity between two graphs, we calculate the Euclidean
distance between their spectral vectors. To cluster the graphs,
we use an M-tree with the Euclidean distance to cluster the spectral vectors.
In addition, the M-tree can be used for graph searching in a graph database.
The proposed method was tested on a graph database of 100 graphs
representing 100 protein structures downloaded from the Protein Data
Bank (PDB), and we compare the results with the SCOP hierarchical
structure.
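The spectral-vector comparison described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the normalized Laplacian is taken as L = I − D^(−1/2) A D^(−1/2), the spectral vector is the sorted list of its eigenvalues, both graphs have the same number of nodes, and all function names are hypothetical. The M-tree indexing step is not shown.

```python
import math

def normalized_laplacian(adj):
    """L = I - D^(-1/2) A D^(-1/2) for a symmetric adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                lap[i][j] = 1.0 if deg[i] > 0 else 0.0
            elif adj[i][j] and deg[i] > 0 and deg[j] > 0:
                lap[i][j] = -adj[i][j] / math.sqrt(deg[i] * deg[j])
    return lap

def jacobi_eigenvalues(matrix, tol=1e-20, max_sweeps=100):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def spectral_vector(adj):
    return jacobi_eigenvalues(normalized_laplacian(adj))

def spectral_distance(adj_a, adj_b):
    """Euclidean distance between spectral vectors (equal graph sizes assumed)."""
    return math.dist(spectral_vector(adj_a), spectral_vector(adj_b))

# Hypothetical toy graphs: a 3-node path and a triangle.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(spectral_distance(path, triangle))
```

How spectral vectors of graphs with different node counts are compared is not stated in the abstract; the sketch sidesteps this by requiring equal sizes.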
Abstract: Biological data has several characteristics that strongly differentiate it from typical business data. It is much more complex, usually large in size, and changes continuously. Until recently, business data has been the main target for discovering trends, patterns or future expectations. However, with the recent rise in biotechnology, the powerful technology that was used for analyzing business data is now being applied to biological data. With the advanced technology at hand, the main trend in biological research is rapidly changing from structural DNA analysis to understanding the cellular functions of DNA sequences. DNA chips are now being used to perform experiments, and DNA analysis processes are being used by researchers. Clustering is one of the important processes used for grouping together similar entities. There are many clustering algorithms, such as hierarchical clustering, self-organizing maps, K-means clustering and so on. In this paper, we propose a clustering algorithm that imitates the ecosystem, taking into account the features of biological data. We implemented the system using an Ant-Colony clustering algorithm. The system decides the number of clusters automatically. The system processes the input biological data, runs the Ant-Colony algorithm, draws the Topic Map, assigns clusters to the genes and displays the output. We tested the algorithm with test data of 100 to 1000 genes and 24 samples, and the results are promising for applying this algorithm to clustering DNA chip data.
Abstract: The self-organizing map (SOM) provides both clustering and visualization capabilities in mining data. Dynamic self-organizing maps such as the Growing Self-organizing Map (GSOM) have been developed to overcome the problem of the fixed structure in the SOM and to enable better representation of the discovered patterns. However, in mining large datasets or historical data, the hierarchical structure of the data is also useful for viewing the cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of trees from different spread factor values of the GSOM is also investigated, and the quality of the trees is analyzed. The results show that concept trees can be generated from the GSOM, thus eliminating the need to re-cluster the data from scratch to obtain a hierarchical view of the data under study.
Abstract: In new urbanism, the role of the neighborhood center as a
semi-public space (the balance space) that bonds private and public
space has disappeared. In this respect, a hierarchical principle in the
traditional neighborhood center appears to create or develop the
conditions for residents' relationships and sense of belonging. This paper
evaluates the significance of hierarchical principles of the neighborhood
center for residents' territoriality and its factors. The Miandeh
neighborhood center in the city of Boshrooyeh was selected as the case
study area. Results indicated that a hierarchical principle is the best
instrument for improving territoriality as a subcomponent of place
belonging among residents. The findings help urban designers to
revitalize neighborhoods and to proceed with the organization of
physical space.
Abstract: In text categorization, the most widely used method
for document representation is based on word frequency vectors and is
called the VSM (Vector Space Model). This representation is based only
on the words in the documents and therefore loses any “word context”
information found in the document. In this article we compare
the classical method of document representation
with a method called the Suffix Tree Document Model (STDM), which is
based on representing documents in the suffix tree format. For the
STDM model we propose a new approach to document
representation and a new formula for computing the similarity
between two documents. We propose to build the suffix tree
for only two documents at a time. This approach is faster, has
lower memory consumption, and uses the entire document representation
without resorting to methods for disposing of nodes. The proposed
similarity formula
substantially improves the clustering quality. This
representation method was validated using HAC (Hierarchical
Agglomerative Clustering). In this context we also examine the
influence of stemming in the document preprocessing step and highlight
the difference between similarity and dissimilarity measures for finding
“closer” documents.
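The loss of word context in the VSM can be seen in a small sketch. It assumes whitespace tokenization, raw term counts and cosine similarity, none of which are specified in the abstract; the function names and sample sentences are hypothetical. The STDM similarity formula proposed in the paper is not reproduced here.

```python
import math
from collections import Counter

def term_frequencies(text):
    """Bag-of-words vector: word -> count (lowercased whitespace tokens)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

d1 = term_frequencies("dog bites man")
d2 = term_frequencies("man bites dog")  # same words, different order
d3 = term_frequencies("cat sleeps")

# d1 and d2 have identical word counts, so the VSM cannot tell them apart.
print(cosine_similarity(d1, d2), cosine_similarity(d1, d3))
```

A representation that keeps word sequences, such as a suffix tree over the documents' phrases, can distinguish the first two sentences, which is the motivation given for the STDM.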