Analyzing Multi-Labeled Data Based on the Roll of a Concept against a Semantic Range

Classifying data hierarchically is an efficient approach to analyze data. Data is usually classified into multiple categories, or annotated with a set of labels. To analyze multi-labeled data, such data must be specified by giving a set of labels as a semantic range. There are some certain purposes to analyze data. This paper shows which multi-labeled data should be the target to be analyzed for those purposes, and discusses the role of a label against a set of labels by investigating the change when a label is added to the set of labels. These discussions give the methods for the advanced analysis of multi-labeled data, which are based on the role of a label against a semantic range.

Non-Overlapping Hierarchical Index Structure for Similarity Search

In order to accelerate the similarity search in highdimensional database, we propose a new hierarchical indexing method. It is composed of offline and online phases. Our contribution concerns both phases. In the offline phase, after gathering the whole of the data in clusters and constructing a hierarchical index, the main originality of our contribution consists to develop a method to construct bounding forms of clusters to avoid overlapping. For the online phase, our idea improves considerably performances of similarity search. However, for this second phase, we have also developed an adapted search algorithm. Our method baptized NOHIS (Non-Overlapping Hierarchical Index Structure) use the Principal Direction Divisive Partitioning (PDDP) as algorithm of clustering. The principle of the PDDP is to divide data recursively into two sub-clusters; division is done by using the hyper-plane orthogonal to the principal direction derived from the covariance matrix and passing through the centroid of the cluster to divide. Data of each two sub-clusters obtained are including by a minimum bounding rectangle (MBR). The two MBRs are directed according to the principal direction. Consequently, the nonoverlapping between the two forms is assured. Experiments use databases containing image descriptors. Results show that the proposed method outperforms sequential scan and SRtree in processing k-nearest neighbors.

Development of a Clustered Network based on Unique Hop ID

In this paper, Land Marks for Unique Addressing( LMUA) algorithm is develped to generate unique ID for each and every node which leads to the formation of overlapping/Non overlapping clusters based on unique ID. To overcome the draw back of the developed LMUA algorithm, the concept of clustering is introduced. Based on the clustering concept a Land Marks for Unique Addressing and Clustering(LMUAC) Algorithm is developed to construct strictly non-overlapping clusters and classify those nodes in to Cluster Heads, Member Nodes, Gate way nodes and generating the Hierarchical code for the cluster heads to operate in the level one hierarchy for wireless communication switching. The expansion of the existing network can be performed or not without modifying the cost of adding the clusterhead is shown. The developed algorithm shows one way of efficiently constructing the

Applications of Support Vector Machines on Smart Phone Systems for Emotional Speech Recognition

An emotional speech recognition system for the applications on smart phones was proposed in this study to combine with 3G mobile communications and social networks to provide users and their groups with more interaction and care. This study developed a mechanism using the support vector machines (SVM) to recognize the emotions of speech such as happiness, anger, sadness and normal. The mechanism uses a hierarchical classifier to adjust the weights of acoustic features and divides various parameters into the categories of energy and frequency for training. In this study, 28 commonly used acoustic features including pitch and volume were proposed for training. In addition, a time-frequency parameter obtained by continuous wavelet transforms was also used to identify the accent and intonation in a sentence during the recognition process. The Berlin Database of Emotional Speech was used by dividing the speech into male and female data sets for training. According to the experimental results, the accuracies of male and female test sets were increased by 4.6% and 5.2% respectively after using the time-frequency parameter for classifying happy and angry emotions. For the classification of all emotions, the average accuracy, including male and female data, was 63.5% for the test set and 90.9% for the whole data set.

Correspondence between Function and Interaction in Protein Interaction Network of Saccaromyces cerevisiae

Understanding the cell's large-scale organization is an interesting task in computational biology. Thus, protein-protein interactions can reveal important organization and function of the cell. Here, we investigated the correspondence between protein interactions and function for the yeast. We obtained the correlations among the set of proteins. Then these correlations are clustered using both the hierarchical and biclustering methods. The detailed analyses of proteins in each cluster were carried out by making use of their functional annotations. As a result, we found that some functional classes appear together in almost all biclusters. On the other hand, in hierarchical clustering, the dominancy of one functional class is observed. In the light of the clustering data, we have verified some interactions which were not identified as core interactions in DIP and also, we have characterized some functionally unknown proteins according to the interaction data and functional correlation. In brief, from interaction data to function, some correlated results are noticed about the relationship between interaction and function which might give clues about the organization of the proteins, also to predict new interactions and to characterize functions of unknown proteins.

Wavelet Compression of ECG Signals Using SPIHT Algorithm

In this paper we present a novel approach for wavelet compression of electrocardiogram (ECG) signals based on the set partitioning in hierarchical trees (SPIHT) coding algorithm. SPIHT algorithm has achieved prominent success in image compression. Here we use a modified version of SPIHT for one dimensional signals. We applied wavelet transform with SPIHT coding algorithm on different records of MIT-BIH database. The results show the high efficiency of this method in ECG compression.

Interactive Concept-based Search using MOEA:The Hierarchical Preferences Case

An IEC technique is described for a multi-objective search of conceptual solutions. The survivability of solutions is influenced by both model-based fitness and subjective human preferences. The concepts- preferences are articulated via a hierarchy of sub-concepts. The suggested method produces an objectivesubjective front. Academic example is employed to demonstrate the proposed approach.

RAPD Analysis of Genetic Diversity of Castor Bean

The aim of this work was to detect genetic variability among the set of 40 castor genotypes using 8 RAPD markers. Amplification of genomic DNA of 40 genotypes, using RAPD analysis, yielded in 66 fragments, with an average of 8.25 polymorphic fragments per primer. Number of amplified fragments ranged from 3 to 13, with the size of amplicons ranging from 100 to 1200 bp. Values of the polymorphic information content (PIC) value ranged from 0.556 to 0.895 with an average of 0.784 and diversity index (DI) value ranged from 0.621 to 0.896 with an average of 0.798. The dendrogram based on hierarchical cluster analysis using UPGMA algorithm was prepared and analyzed genotypes were grouped into two main clusters and only two genotypes could not be distinguished. Knowledge on the genetic diversity of castor can be used for future breeding programs for increased oil production for industrial uses.

A New Framework for Evaluation and Prioritization of Suppliers using a Hierarchical Fuzzy TOPSIS

This paper suggests an algorithm for the evaluation and selection of suppliers. At the beginning, all the needed materials and services used by the organization were identified and categorized with regard to their nature by ABC method. Afterwards, in order to reduce risk factors and maximize the organization's profit, purchase strategies were determined. Then, appropriate criteria were identified for primary evaluation of suppliers applying to the organization. The output of this stage was a list of suppliers qualified by the organization to participate in its tenders. Subsequently, considering a material in particular, appropriate criteria on the ordering of the mentioned material were determined, taking into account the particular materials' specifications as well as the organization's needs. Finally, for the purpose of validation and verification of the proposed model, it was applied to Mobarakeh Steel Company (MSC), the qualified suppliers of this Company are ranked by the means of a Hierarchical Fuzzy TOPSIS method. The obtained results show that the proposed algorithm is quite effective, efficient and easy to apply.

A Consistency Protocol Multi-Layer for Replicas Management in Large Scale Systems

Large scale systems such as computational Grid is a distributed computing infrastructure that can provide globally available network resources. The evolution of information processing systems in Data Grid is characterized by a strong decentralization of data in several fields whose objective is to ensure the availability and the reliability of the data in the reason to provide a fault tolerance and scalability, which cannot be possible only with the use of the techniques of replication. Unfortunately the use of these techniques has a height cost, because it is necessary to maintain consistency between the distributed data. Nevertheless, to agree to live with certain imperfections can improve the performance of the system by improving competition. In this paper, we propose a multi-layer protocol combining the pessimistic and optimistic approaches conceived for the data consistency maintenance in large scale systems. Our approach is based on a hierarchical representation model with tree layers, whose objective is with double vocation, because it initially makes it possible to reduce response times compared to completely pessimistic approach and it the second time to improve the quality of service compared to an optimistic approach.

An Improvement of PDLZW implementation with a Modified WSC Updating Technique on FPGA

In this paper, an improvement of PDLZW implementation with a new dictionary updating technique is proposed. A unique dictionary is partitioned into hierarchical variable word-width dictionaries. This allows us to search through dictionaries in parallel. Moreover, the barrel shifter is adopted for loading a new input string into the shift register in order to achieve a faster speed. However, the original PDLZW uses a simple FIFO update strategy, which is not efficient. Therefore, a new window based updating technique is implemented to better classify the difference in how often each particular address in the window is referred. The freezing policy is applied to the address most often referred, which would not be updated until all the other addresses in the window have the same priority. This guarantees that the more often referred addresses would not be updated until their time comes. This updating policy leads to an improvement on the compression efficiency of the proposed algorithm while still keep the architecture low complexity and easy to implement.

University Ranking Systems – From League Table to Homogeneous Groups of Universities

The paper contains a review of the literature in terms of the critical analysis of methodologies of university ranking systems. Furthermore, the initiatives supported by the European Commission (U-Map, U-Multirank) and CHE Ranking are described. Special attention is paid to the tendencies in the development of ranking systems. According to the author, the ranking organizations should abandon the classic form of ranking, namely a hierarchical ordering of universities from “the best" to “the worse". In the empirical part of this paper, using one of the method of cluster analysis called k-means clustering, the author presents university classifications of the top universities from the Shanghai Jiao Tong University-s (SJTU) Academic Ranking of World Universities (ARWU).

Model-Based Person Tracking Through Networked Cameras

This paper proposes a way to track persons by making use of multiple non-overlapping cameras. Tracking persons on multiple non-overlapping cameras enables data communication among cameras through the network connection between a camera and a computer, while at the same time transferring human feature data captured by a camera to another camera that is connected via the network. To track persons with a camera and send the tracking data to another camera, the proposed system uses a hierarchical human model that comprises a head, a torso, and legs. The feature data of the person being modeled are transferred to the server, after which the server sends the feature data of the human model to the cameras connected over the network. This enables a camera that captures a person's movement entering its vision to keep tracking the recognized person with the use of the feature data transferred from the server.

Multi-labeled Data Expressed by a Set of Labels

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is efficient approach to organize data. When data is classified to multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels and shows that there are four types of orders, which are characterized by whether the labels of expressed data includes every label of the given set of labels within the range of the set. Desirable properties of the orders, data is also expressed by the higher set of labels and different sets of labels express different data, are discussed for the orders.

Learning Classifier Systems Approach for Automated Discovery of Crisp and Fuzzy Hierarchical Production Rules

This research presents a system for post processing of data that takes mined flat rules as input and discovers crisp as well as fuzzy hierarchical structures using Learning Classifier System approach. Learning Classifier System (LCS) is basically a machine learning technique that combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. A LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. Crisp description for a concept usually cannot represent human knowledge completely and practically. In the proposed Learning Classifier System initial population is constructed as a random collection of HPR–trees (related production rules) and crisp / fuzzy hierarchies are evolved. A fuzzy subsumption relation is suggested for the proposed system and based on Subsumption Matrix (SM), a suitable fitness function is proposed. Suitable genetic operators are proposed for the chosen chromosome representation method. For implementing reinforcement a suitable reward and punishment scheme is also proposed. Experimental results are presented to demonstrate the performance of the proposed system.

Using Spectral Vectors and M-Tree for Graph Clustering and Searching in Graph Databases of Protein Structures

In this paper, we represent protein structure by using graph. A protein structure database will become a graph database. Each graph is represented by a spectral vector. We use Jacobi rotation algorithm to calculate the eigenvalues of the normalized Laplacian representation of adjacency matrix of graph. To measure the similarity between two graphs, we calculate the Euclidean distance between two graph spectral vectors. To cluster the graphs, we use M-tree with the Euclidean distance to cluster spectral vectors. Besides, M-tree can be used for graph searching in graph database. Our proposal method was tested with graph database of 100 graphs representing 100 protein structures downloaded from Protein Data Bank (PDB) and we compare the result with the SCOP hierarchical structure.

An Ant-based Clustering System for Knowledge Discovery in DNA Chip Analysis Data

Biological data has several characteristics that strongly differentiate it from typical business data. It is much more complex, usually large in size, and continuously changes. Until recently business data has been the main target for discovering trends, patterns or future expectations. However, with the recent rise in biotechnology, the powerful technology that was used for analyzing business data is now being applied to biological data. With the advanced technology at hand, the main trend in biological research is rapidly changing from structural DNA analysis to understanding cellular functions of the DNA sequences. DNA chips are now being used to perform experiments and DNA analysis processes are being used by researchers. Clustering is one of the important processes used for grouping together similar entities. There are many clustering algorithms such as hierarchical clustering, self-organizing maps, K-means clustering and so on. In this paper, we propose a clustering algorithm that imitates the ecosystem taking into account the features of biological data. We implemented the system using an Ant-Colony clustering algorithm. The system decides the number of clusters automatically. The system processes the input biological data, runs the Ant-Colony algorithm, draws the Topic Map, assigns clusters to the genes and displays the output. We tested the algorithm with a test data of 100 to1000 genes and 24 samples and show promising results for applying this algorithm to clustering DNA chip data.

Generating Concept Trees from Dynamic Self-organizing Map

Self-organizing map (SOM) provides both clustering and visualization capabilities in mining data. Dynamic self-organizing maps such as Growing Self-organizing Map (GSOM) has been developed to overcome the problem of fixed structure in SOM to enable better representation of the discovered patterns. However, in mining large datasets or historical data the hierarchical structure of the data is also useful to view the cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of tree from different spread factor values of GSOM is also investigated and the quality of the trees analyzed. The results show that concept trees can be generated from GSOM, thus, eliminating the need for re-clustering of the data from scratch to obtain a hierarchical view of the data under study.

Sense of Territoriality and Revitalization of Neighborhood Centers in Boshrooyeh City

The role of neighborhood center as semi public (the balance space) is disappeared in bonding between private and public in new urbanism. In this way, a hierarchical principle in the traditional neighborhood center appears to create or develop the conditions for residents` relationships and belonging. This paper evaluates significant of hierarchical principles of the neighborhood center in residents` territoriality and its factors. In this way Miandeh neighborhood center from Boshrooyeh city was determined as a case study area. Results indicated that a hierarchical principle is the best instrument to improve the territoriality as the subcomponent of place belonging in residents. The findings help the urban designer to revitalization the neighborhoods and proceedings in organization of physical space.

Using Suffix Tree Document Representation in Hierarchical Agglomerative Clustering

In text categorization problem the most used method for documents representation is based on words frequency vectors called VSM (Vector Space Model). This representation is based only on words from documents and in this case loses any “word context" information found in the document. In this article we make a comparison between the classical method of document representation and a method called Suffix Tree Document Model (STDM) that is based on representing documents in the Suffix Tree format. For the STDM model we proposed a new approach for documents representation and a new formula for computing the similarity between two documents. Thus we propose to build the suffix tree only for any two documents at a time. This approach is faster, it has lower memory consumption and use entire document representation without using methods for disposing nodes. Also for this method is proposed a formula for computing the similarity between documents, which improves substantially the clustering quality. This representation method was validated using HAC - Hierarchical Agglomerative Clustering. In this context we experiment also the stemming influence in the document preprocessing step and highlight the difference between similarity or dissimilarity measures to find “closer" documents.