Genetic Algorithms for Feature Generation in the Context of Audio Classification

Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a useful representation to machine learning tasks. In automatic audio classification tasks, this is interesting since the audio, usually complex information, needs to be transformed into a computationally convenient input to process. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have being used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.

Automatic Musical Genre Classification Using Divergence and Average Information Measures

Recently many research has been conducted to retrieve pertinent parameters and adequate models for automatic music genre classification. In this paper, two measures based upon information theory concepts are investigated for mapping the features space to decision space. A Gaussian Mixture Model (GMM) is used as a baseline and reference system. Various strategies are proposed for training and testing sessions with matched or mismatched conditions, long training and long testing, long training and short testing. For all experiments, the file sections used for testing are never been used during training. With matched conditions all examined measures yield the best and similar scores (almost 100%). With mismatched conditions, the proposed measures yield better scores than the GMM baseline system, especially for the short testing case. It is also observed that the average discrimination information measure is most appropriate for music category classifications and on the other hand the divergence measure is more suitable for music subcategory classifications.

Using HMM-based Classifier Adapted to Background Noises with Improved Sounds Features for Audio Surveillance Application

Discrimination between different classes of environmental sounds is the goal of our work. The use of a sound recognition system can offer concrete potentialities for surveillance and security applications. The first paper contribution to this research field is represented by a thorough investigation of the applicability of state-of-the-art audio features in the domain of environmental sound recognition. Additionally, a set of novel features obtained by combining the basic parameters is introduced. The quality of the features investigated is evaluated by a HMM-based classifier to which a great interest was done. In fact, we propose to use a Multi-Style training system based on HMMs: one recognizer is trained on a database including different levels of background noises and is used as a universal recognizer for every environment. In order to enhance the system robustness by reducing the environmental variability, we explore different adaptation algorithms including Maximum Likelihood Linear Regression (MLLR), Maximum A Posteriori (MAP) and the MAP/MLLR algorithm that combines MAP and MLLR. Experimental evaluation shows that a rather good recognition rate can be reached, even under important noise degradation conditions when the system is fed by the convenient set of features.