Abstract: Hierarchical classification is a special type of classification task where the class labels are organised into a hierarchy, with more generic class labels being ancestors of more specific ones. Meta-learning for classification-algorithm recommendation consists of recommending to the user a classification algorithm, from a pool of candidate algorithms, for a dataset, based on the past performance of the candidate algorithms in other datasets. Meta-learning is normally used in conventional, non-hierarchical classification. By contrast, this paper proposes a meta-learning approach for more challenging task of hierarchical classification, and evaluates it in a large number of bioinformatics datasets. Hierarchical classification is especially relevant for bioinformatics problems, as protein and gene functions tend to be organised into a hierarchy of class labels. This work proposes meta-learning approach for
recommending the best hierarchical classification algorithm to a
hierarchical classification dataset. This work’s contributions are: 1)
proposing an algorithm for splitting hierarchical datasets into
new datasets to increase the number of meta-instances, 2) proposing
meta-features for hierarchical classification, and 3) interpreting
decision-tree meta-models for hierarchical classification algorithm
recommendation.
Abstract: ISO/IEC/IEEE 15288: 2015, Systems and Software Engineering - System Life Cycle Processes is an international standard that provides generic top-level process descriptions to support systems engineering (SE). However, the processes defined in the standard needs improvement to lift integrity and consistency. The goal of this research is to explore the way by building an ontology model for the SE standard to manage the knowledge of SE. The ontology model gives a whole picture of the SE knowledge domain by building connections between SE concepts. Moreover, it creates a hierarchical classification of the concepts to fulfil different requirements of displaying and analysing SE knowledge.
Abstract: The objective of the paper is the study of geographic, economic and educational variables and their contribution to determine the position of each member-state among the EU-28 countries based on the values of seven variables as given by Eurostat. The Data Analysis methods of Multiple Factorial Correspondence Analysis (MFCA) Principal Component Analysis and Factor Analysis have been used. The cross tabulation tables of data consist of the values of seven variables for the 28 countries for 2014. The data are manipulated using the CHIC Analysis V 1.1 software package. The results of this program using MFCA and Ascending Hierarchical Classification are given in arithmetic and graphical form. For comparison reasons with the same data the Factor procedure of Statistical package IBM SPSS 20 has been used. The numerical and graphical results presented with tables and graphs, demonstrate the agreement between the two methods. The most important result is the study of the relation between the 28 countries and the position of each country in groups or clouds, which are formed according to the values of the corresponding variables.
Abstract: Hierarchical classification is a problem with applications in many areas as protein function prediction where the dates are hierarchically structured. Therefore, it is necessary the development of algorithms able to induce hierarchical classification models. This paper presents experimenters using the algorithm for hierarchical classification called Multi-label Hierarchical Classification using a Competitive Neural Network (MHC-CNN). It was tested in ten datasets the Gene Ontology (GO) Cellular Component Domain. The results are compared with the Clus-HMC and Clus-HSC using the hF-Measure.
Abstract: Classifying biomedical literature is a difficult and
challenging task, especially when a large number of biomedical
articles should be organized into a hierarchical structure. In this paper,
we present an approach for classifying a collection of biomedical text
abstracts downloaded from Medline database with the help of
ontology alignment. To accomplish our goal, we construct two types
of hierarchies, the OHSUMED disease hierarchy and the Medline
abstract disease hierarchies from the OHSUMED dataset and the
Medline abstracts, respectively. Then, we enrich the OHSUMED
disease hierarchy before adapting it to ontology alignment process for
finding probable concepts or categories. Subsequently, we compute
the cosine similarity between the vector in probable concepts (in the
“enriched" OHSUMED disease hierarchy) and the vector in Medline
abstract disease hierarchies. Finally, we assign category to the new
Medline abstracts based on the similarity score. The results obtained
from the experiments show the performance of our proposed approach
for hierarchical classification is slightly better than the performance of
the multi-class flat classification.
Abstract: Collected data must be organized to be utilized efficiently, and hierarchical classification of data is efficient approach to organize data. When data is classified to multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels and shows that there are four types of orders, which are characterized by whether the labels of expressed data includes every label of the given set of labels within the range of the set. Desirable properties of the orders, data is also expressed by the higher set of labels and different sets of labels express different data, are discussed for the orders.
Abstract: There are several approaches for handling multiclass classification. Aside from one-against-one (OAO) and one-against-all (OAA), hierarchical classification technique is also commonly used. A binary classification tree is a hierarchical classification structure that breaks down a k-class problem into binary sub-problems, each solved by a binary classifier. In each node, a set of classes is divided into two subsets. A good class partition should be able to group similar classes together. Many algorithms measure similarity in term of distance between class centroids. Classes are grouped together by a clustering algorithm when distances between their centroids are small. In this paper, we present a binary classification tree with tuned observation-based clustering (BCT-TOB) that finds a class partition by performing clustering on observations instead of class centroids. A merging step is introduced to merge any insignificant class split. The experiment shows that performance of BCT-TOB is comparable to other algorithms.