Improved Closed Set Text-Independent Speaker Identification by Combining MFCC with Evidence from Flipped Filter Banks

A state-of-the-art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC), modeled on the human auditory system, have been used as a standard acoustic feature set for SI applications. However, due to the structure of its filter bank, MFCC captures vocal tract characteristics more effectively in the lower frequency regions. This paper proposes a new set of features using a complementary filter bank structure which improves distinguishability of speaker-specific cues present in the higher frequency zone. Unlike high-level features that are difficult to extract, the proposed feature set involves little computational burden during the extraction process. When combined with MFCC via a parallel implementation of speaker models, the proposed feature set outperforms baseline MFCC significantly. This proposition is validated by experiments conducted on two different kinds of public databases, namely YOHO (microphone speech) and POLYCOST (telephone speech), with Gaussian Mixture Models (GMM) as the classifier for various model orders.
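
As a rough illustration of the idea, the sketch below builds a conventional triangular mel filter bank and then mirrors it along the frequency axis, so that the narrow, densely packed filters sit in the high-frequency zone instead of the low one. This is only one plausible reading of a "flipped" or complementary filter bank; the function names, the number of filters, the FFT size, and the 8 kHz sampling rate are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (O'Shaughnessy form)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters=20, n_fft=512, sample_rate=8000):
    """Triangular filters with edges equally spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                            n_filters + 2)
    bin_edges = np.floor((n_fft + 1) * mel_to_hz(mel_edges)
                         / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, centre, right = bin_edges[i], bin_edges[i + 1], bin_edges[i + 2]
        for k in range(left, centre):      # rising slope
            bank[i, k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling slope
            bank[i, k] = (right - k) / (right - centre)
    return bank

def flip_filter_bank(bank):
    """Assumed flipping rule: reverse both filter order and frequency axis."""
    return bank[::-1, ::-1]

bank = mel_filter_bank()
flipped = flip_filter_bank(bank)
# In the mel bank the lowest filter is the narrowest;
# in the flipped bank the highest filter is the narrowest.
print(np.count_nonzero(bank[0]), np.count_nonzero(flipped[-1]))
```

Cepstral features from the flipped bank would then be computed the same way as MFCC (log filter-bank energies followed by a DCT), which is why the extra extraction cost stays small.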

Economic Evaluations Using Genetic Algorithms to Determine the Territorial Impact Caused by High Speed Railways

The evolution of technology and construction techniques has enabled the upgrading of transport networks. In particular, high-speed rail networks allow trains to reach speeds above 300 km/h. These structures, however, often significantly impact the surrounding environment. Among the most important effects are those provoked by the sound wave connected to train transit. The wave propagation affects the quality of life in areas surrounding the tracks, often for several hundred metres. There is substantial damage to properties (buildings and land) in terms of market depreciation. The present study, integrating expertise in the acoustics, computing and evaluation fields, outlines a useful model to select project paths so as to minimize the noise impact and reduce the causes of possible litigation. It also facilitates the rational selection of initiatives to contain the environmental damage caused by already existing railway tracks. The research is developed with reference to the Italian regulatory framework (usually more stringent than European and international standards) and refers to a case study concerning the high-speed network in Italy.

Environmental Capacity and Sustainability of European Regional Airports: A Case Study

Airport capacity has traditionally been understood as the number of aircraft operations during a specified time corresponding to a tolerable level of average delay; it mostly depends on the airside characteristics, on the fleet mix variability and on the ATM. The adoption of Directive 2002/30/EC in the EU countries, however, drives stakeholders to conceive airport capacity in a different way. Airport capacity in this sense is fundamentally driven by environmental criteria, and since acoustical externalities represent the most important factors, they are the ones that could pose a serious threat to the growth of airports and to the aviation market itself in the short to medium term. The importance of regional airports in the deregulated market has grown fast during the last decade, since they represent spokes for network carriers and a preferential destination for low-fare carriers. Regional airports have witnessed not only a fast and unexpected growth in traffic but also a fast growth in nuisance complaints from the people living near them. This paper presents the results of a study conducted in cooperation with the Bologna G. Marconi airport to investigate airport acoustical capacity as a de facto constraint on airport growth.

Identification of Wideband Sources Using Higher Order Statistics in Noisy Environment

This paper deals with the localization of wideband sources. We develop a new approach for estimating the parameters of wideband sources. The method is based on the higher-order statistics of the recorded data in order to eliminate the Gaussian components from the signals received on the various hydrophones. In fact, the noise of the sea bottom is regarded as being Gaussian. Thanks to the coherent signal subspace algorithm based on the cumulant matrix of the received data instead of the cross-spectral matrix, the wideband correlated sources are perfectly located in a very noisy environment. We demonstrate the performance of the proposed algorithm on real data recorded during an underwater acoustics experiment.

Shear-Layer Instabilities of a Pulsed Stack-Issued Transverse Jet

Shear-layer instabilities of a pulsed stack-issued transverse jet were studied experimentally in a wind tunnel. Jet pulsations were induced by means of acoustic excitation. Streak pictures of the smoke-flow patterns illuminated by a laser-light sheet in the median plane were recorded with a high-speed digital camera. Instantaneous velocities of the shear-layer instabilities in the flow were digitized by a hot-wire anemometer. By analyzing the streak pictures of the smoke-flow visualization, three characteristic flow modes (synchronized flapping jet, transition, and synchronized shear-layer vortices) are identified in the shear layer of the pulsed stack-issued transverse jet at various excitation Strouhal numbers. The shear-layer instabilities of the pulsed stack-issued transverse jet are synchronized by acoustic excitation except in the transition mode. In the transition flow mode, the shear-layer vortices exhibit a frequency twice the acoustic excitation frequency.

Distinguishing Innocent Murmurs from Murmurs caused by Aortic Stenosis by Recurrence Quantification Analysis

It is sometimes difficult to differentiate between innocent murmurs and pathological murmurs during auscultation. In these difficult cases, an intelligent stethoscope with decision support abilities would be of great value. In this study, using a dog model, phonocardiographic recordings were obtained from 27 boxer dogs with various degrees of aortic stenosis (AS) severity. As a reference for severity assessment, continuous wave Doppler was used. The data were analyzed with recurrence quantification analysis (RQA) with the aim of finding features able to distinguish innocent murmurs from murmurs caused by AS. Four out of eight investigated RQA features showed significant differences between innocent murmurs and pathological murmurs. Using a plain linear discriminant analysis classifier, the best pair of features (recurrence rate and entropy) resulted in a sensitivity of 90% and a specificity of 88%. In conclusion, RQA provides valid features which can be used for differentiation between innocent murmurs and murmurs caused by AS.
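
To make the recurrence rate feature concrete, the following minimal sketch builds a recurrence matrix from a time-delay embedding and reports the fraction of recurrent point pairs. The embedding dimension, delay, and radius are hypothetical values chosen for illustration; the abstract does not state the settings used in the study, and the entropy feature is not sketched here.

```python
import numpy as np

def recurrence_matrix(signal, dim=3, delay=1, radius=0.5):
    """Binary recurrence matrix of a time-delay embedded signal.

    dim, delay and radius are illustrative assumptions.
    """
    x = np.asarray(signal, dtype=float)
    n = len(x) - (dim - 1) * delay
    # Each row is one embedded state vector [x(t), x(t+delay), ...]
    emb = np.column_stack([x[i * delay: i * delay + n] for i in range(dim)])
    # Pairwise Euclidean distances between all embedded states
    dists = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
    return (dists <= radius).astype(int)

def recurrence_rate(rm):
    """Fraction of state pairs that are closer than the radius."""
    return float(rm.mean())

t = np.arange(200)
periodic = np.sin(2 * np.pi * t / 20)
rng = np.random.default_rng(0)
noisy = rng.standard_normal(200)
print(recurrence_rate(recurrence_matrix(periodic)),
      recurrence_rate(recurrence_matrix(noisy)))
```

The pairwise distance matrix grows quadratically with signal length, so a sketch like this is only practical for short segments.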

Tool Failure Detection Based on Statistical Analysis of Metal Cutting Acoustic Emission Signals

The analysis of the Acoustic Emission (AE) signal generated from metal cutting processes has often been approached statistically. This is due to the stochastic nature of the emission signal, a result of factors affecting the signal from its generation through transmission and sensing. Different techniques are applied in this manner, each of which is suitable for certain processes. In metal cutting, where the emission generated by the deformation process is rather continuous, a method for analysing the AE signal based on the root mean square (RMS) of the signal is often used and is suitable for use with conventional signal processing systems. The aim of this paper is to set a strategy for tool failure detection in turning processes via the statistical analysis of the AE generated from the cutting zone. The strategy is based on the investigation of the distribution moments of the AE signal at predetermined sampling. The skewness and kurtosis of these distributions are the key elements in the detection. A normal (Gaussian) distribution was first suggested but was then eliminated due to insufficiency. The Beta distribution was then considered; it has been used with an assumed β density function and has given promising results with regard to chipping and tool breakage detection.
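
The quantities named above (RMS, skewness, kurtosis) can be computed per frame as in the sketch below. It assumes the moments are taken directly over fixed-length frames of the raw AE samples; the paper may instead compute them over a distribution of RMS values, and the frame length and function name are hypothetical.

```python
import numpy as np

def frame_statistics(signal, frame_len):
    """Per-frame RMS, skewness and (non-excess) kurtosis.

    Assumes non-constant frames; a constant frame has zero standard
    deviation and would produce a division by zero.
    """
    x = np.asarray(signal, dtype=float)
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    centred = frames - frames.mean(axis=1, keepdims=True)
    std = centred.std(axis=1)
    skew = np.mean(centred ** 3, axis=1) / std ** 3   # third standardized moment
    kurt = np.mean(centred ** 4, axis=1) / std ** 4   # fourth standardized moment
    return rms, skew, kurt

rng = np.random.default_rng(1)
ae = rng.standard_normal(4096)          # stand-in for a recorded AE signal
rms, skew, kurt = frame_statistics(ae, 1024)
print(rms, skew, kurt)
```

A sudden shift in the skewness or kurtosis sequence relative to its running baseline would be the kind of event a detection rule could key on, though the thresholds would have to come from measured data.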

A New Vector Quantization Front-End Process for Discrete HMM Speech Recognition System

The paper presents a complete discrete statistical framework based on a novel vector quantization (VQ) front-end process. This new VQ approach performs an optimal distribution of VQ codebook components on HMM states. This technique, which we named the distributed vector quantization (DVQ) of hidden Markov models, succeeds in unifying acoustic micro-structure and phonetic macro-structure when the estimation of HMM parameters is performed. The DVQ technique is implemented through two variants. The first variant uses the K-means algorithm (K-means-DVQ) to optimize the VQ, while the second variant exploits the benefits of the classification behavior of neural networks (NN-DVQ) for the same purpose. The proposed variants are compared with the HMM-based baseline system in experiments on the recognition of specific Arabic consonants. The results show that the distributed vector quantization technique increases the performance of the discrete HMM system.

Convection through Light Weight Timber Constructions with Mineral Wool

The major part of light weight timber constructions consists of insulation. Mineral wool is the most commonly used insulation due to its cost efficiency and easy handling. The fiber orientation and porosity of this insulation material enable flow-through. The air flow resistance is low. If leakage occurs in the insulated bay section, the convective flow may cause energy losses and infiltration of the exterior wall with moisture and particles. In particular, the infiltrated moisture may lead to thermal bridges and growth of health-endangering mould and mildew. In order to prevent this problem, different numerical calculation models have been developed. All models developed so far leave room for completion. The implementation of the flow-through properties of mineral wool insulation may help to improve the existing models. Assuming that the real pressure difference between interior and exterior surface is larger than the prescribed pressure difference in the standard test procedure for mineral wool ISO 9053 / EN 29053, measurements were performed using the measurement setup for research on convective moisture transfer “MSRCMT”. These measurements show that structural inhomogeneities of mineral wool affect the permeability only at higher pressure differences, as applied in MSRCMT. Additional microscopic investigations show that the location of a leak within the construction has a crucial influence on the air flow-through and the infiltration rate. The results clearly indicate that the empirical values for the acoustic resistance of mineral wool should not be used for the calculation of convective transfer mechanisms.

Time-Delay Estimation Using Cross-ΨB-Energy Operator

In this paper, a new time-delay estimation technique based on the cross-ΨB-energy operator [5] is introduced. This quadratic energy detector measures how much a signal is present in another one. The location of the peak of the energy operator, corresponding to the maximum of interaction between the two signals, is the estimate of the delay. The method is a fully data-driven approach. The discrete version of the continuous-time form of the cross-ΨB-energy operator, used for its implementation, is presented. The effectiveness of the proposed method is demonstrated on real underwater acoustic signals arriving from targets, and the results are compared to those of the cross-correlation method.
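
For reference, the cross-correlation baseline mentioned above can be sketched as follows: the delay estimate is the lag at which the cross-correlation of the two signals peaks. This shows only the comparison method, not the cross-ΨB-energy operator itself, whose discrete form is given in the paper and is not reproduced here. The function name is hypothetical.

```python
import numpy as np

def estimate_delay_xcorr(x, y):
    """Lag (in samples) by which y is delayed relative to x.

    Uses the peak of the full cross-correlation as the estimate.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    corr = np.correlate(y, x, mode="full")
    # Lags covered by mode="full": -(len(x) - 1) ... len(y) - 1
    lags = np.arange(-(len(x) - 1), len(y))
    return int(lags[np.argmax(corr)])

rng = np.random.default_rng(2)
source = rng.standard_normal(500)
delayed = np.concatenate([np.zeros(12), source[:-12]])   # 12-sample delay
print(estimate_delay_xcorr(source, delayed))
```

Dividing the returned lag by the sampling rate converts it to seconds.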

Towards Finite Element Modeling of the Acoustics of Human Head

In this paper, a new formulation for acoustics coupled with linear elasticity is presented. The primary objective of the work is to develop a three-dimensional hp-adaptive finite element code intended for modeling the acoustics of the human head. The code will have numerous applications, e.g. in designing hearing protection devices for individuals working in high-noise environments. The presented work is at a preliminary stage. The variational formulation has been implemented and tested on a sequence of meshes with concentric multi-layer spheres, with material data representing the tissue (the brain), the skull and the air. Thus, an efficient solver for coupled elasticity/acoustics problems has been developed and tested on high-contrast material data representing the human head.

Amplification of Compression Waves in Clean and Bubbly Liquid

A theoretical investigation is carried out to describe the increase of pressure wave amplitude in clean and bubbly liquid. The goal of the work is to capture the regime of multiple magnification of acoustic and shock waves in the liquid, which makes it possible to obtain appropriate conditions to enhance collapses of micro-bubbles. The influence of boundary conditions and frequency of the governing acoustic field is studied for the case of the cylindrical acoustic resonator. The formation of standing waves with large amplitude at resonant frequencies has been observed. The interaction of the compression wave with gas and vapor bubbles is investigated for the convergent channel. It is shown theoretically that the chemical reactions which occur inside gas bubbles provide additional impulse to the wave, which strongly affects the collapses of the vapor bubbles.

Using Teager Energy Cepstrum and HMM distances in Automatic Speech Recognition and Analysis of Unvoiced Speech

In this study, the use of a silicon NAM (Non-Audible Murmur) microphone in automatic speech recognition is presented. NAM microphones are special acoustic sensors which are attached behind the talker's ear and can capture not only normal (audible) speech, but also very quietly uttered speech (non-audible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise, and they might be used in special systems (speech recognition, speech conversion, etc.) for sound-impaired people. Using a small amount of training data and adaptation approaches, 93.9% word accuracy was achieved for a 20k Japanese vocabulary dictation task. Non-audible murmur recognition in noisy environments is also investigated. In this study, further analysis of NAM speech has been made using distance measures between hidden Markov model (HMM) pairs. A metric distance shows that NAM speech occupies a reduced spectral space; however, the locations of the NAM phonemes are similar to those of the phonemes of normal speech, and the NAM sounds are well discriminated. Promising results in using nonlinear features are also introduced, especially under noisy conditions.
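
The nonlinear features referred to in the title build on the Teager energy operator. A minimal sketch of its standard discrete form, ψ[x(n)] = x(n)² − x(n−1)·x(n+1), is given below; how the paper turns this energy into cepstral features is not described in the abstract, so only the operator itself is shown and the function name is hypothetical.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager-Kaiser energy: x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input because the
    first and last samples have no neighbour on one side.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# For a pure tone A*cos(omega*n) the operator returns the constant
# A^2 * sin(omega)^2, so it tracks both amplitude and frequency.
n = np.arange(100)
tone = 2.0 * np.cos(0.3 * n)
print(teager_energy(tone)[:3], 4.0 * np.sin(0.3) ** 2)
```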

Sperm Whale Signal Analysis: Comparison using the Auto Regressive model and the Daubechies 15 Wavelets Transform

This article presents the results of using a parametric approach and a Wavelet Transform to analyse signals emitted by the sperm whale. The extraction of intrinsic characteristics of these unique signals emitted by marine mammals is still a difficult exercise for various reasons: firstly, the signals are non-stationary, and secondly, they are obscured by interfering background noise. In this article, we compare the advantages and disadvantages of both methods: Auto Regressive models and the Wavelet Transform. These approaches serve as an alternative to the commonly used estimators based on the Fourier Transform, for which the hypotheses necessary for its application are, in certain cases, not sufficiently proven. These modern approaches provide effective results, particularly for the periodic tracking of the signal's characteristics and notably when the signal-to-noise ratio negatively affects signal tracking. Our objectives are twofold. Our first goal is to identify the animal through its acoustic signature. This includes recognition of the marine mammal species and ultimately of the individual animal (within the species). The second is much more ambitious and directly involves the intervention of cetologists to study the sounds emitted by marine mammals in an effort to characterize their behaviour. We are working on an approach based on recordings of marine mammal signals, with findings obtained from these data using the Wavelet Transform. This article will explore the reasons for using this approach. In addition, thanks to the use of new processors, these algorithms, once heavy in calculation time, can be integrated in a real-time system.

A Preliminary Study of Drug Perfusion Enhancement by Microstreaming Induced by an Oscillating Microbubble

Microbubbles incorporating ultrasound have been used to increase the efficacy of targeted drug delivery, because microstreaming induced by cavitating bubbles affects the drug perfusion into the target cells and tissues. In order to clarify the physical effects of microstreaming on drug perfusion into tissues, a preliminary experimental study of perfusion enhancement by a stably oscillating microbubble was performed. Microstreaming was induced by an oscillating bubble at 15 kHz, and perfusion of dye into an agar phantom was optically measured by histology on the agar phantom. Surface color intensity and the penetration length of dye in the agar phantom were increased by more than 70% and 30%, respectively, due to the microstreaming induced by an oscillating bubble. The mass of dye perfused into a tissue phantom for 30 s was increased by about 80% in the phantom with an oscillating bubble. This preliminary experiment shows that the physical effects of steady streaming by an oscillating bubble can enhance the drug perfusion into the tissues while minimizing the biological effects.

Continuous Feature Adaptation for Non-Native Speech Recognition

The current speech interfaces in many military applications may be adequate for native speakers. However, the recognition rate drops considerably for non-native speakers (people with foreign accents). This is mainly because non-native speakers have large temporal and intra-phoneme variations when they pronounce the same words. This problem is also complicated by the presence of large environmental noise such as tank noise, helicopter noise, etc. In this paper, we propose a novel continuous acoustic feature adaptation algorithm for on-line accent and environmental adaptation. Implemented by incremental singular value decomposition (SVD), the algorithm captures local acoustic variation and runs in real time. This feature-based adaptation method is then integrated with the conventional model-based maximum likelihood linear regression (MLLR) algorithm. Extensive experiments have been performed on the NATO non-native speech corpus with a baseline acoustic model trained on native American English. The proposed feature-based adaptation algorithm improved the average recognition accuracy by 15%, while the MLLR model-based adaptation achieved an 11% improvement. The corresponding word error rate (WER) reduction was 25.8% and 2.73%, as compared to that without adaptation. The combined adaptation achieved an overall recognition accuracy improvement of 29.5% and a WER reduction of 31.8%, as compared to that without adaptation.

Rarefactive and Compressive Solitons in Warm Dusty Plasma with Electrons and Nonthermal Ions

Dust acoustic solitary waves are studied in warm dusty plasma containing negatively charged dust, nonthermal ions and Boltzmann-distributed electrons. The Sagdeev pseudopotential method is used to investigate solitary wave solutions in the plasma. The existence of compressive and rarefactive solitons is studied.

Mathematical Model of the Respiratory System – Comparison of the Total Lung Impedance in the Adult and Neonatal Lung

A mathematical model of the respiratory system is introduced in this study. Geometrical dimensions of the respiratory system were used to compute the acoustic properties of the respiratory system using the electro-acoustic analogy. The effect of the geometrical proportions of the respiratory system is observed in the paper.
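
As a hedged illustration of the electro-acoustic analogy, the sketch below maps the geometry of a single rigid cylindrical airway segment onto lumped acoustic elements using textbook formulas: resistance from Poiseuille flow, inertance from the air mass in the tube, and compliance from the enclosed air volume. The abstract does not say which element formulas or circuit topology the model uses, so the function name, the air properties used as defaults, and the example dimensions are all assumptions.

```python
import math

def tube_elements(length, radius, rho=1.2, eta=1.8e-5, c=343.0):
    """Lumped acoustic elements of a rigid cylindrical tube (SI units).

    rho: air density, eta: dynamic viscosity, c: speed of sound;
    the defaults are approximate room-condition values (assumed).
    """
    area = math.pi * radius ** 2
    resistance = 8.0 * eta * length / (math.pi * radius ** 4)  # analogous to R
    inertance = rho * length / area                            # analogous to L
    compliance = area * length / (rho * c ** 2)                # analogous to C
    return resistance, inertance, compliance

# Illustrative dimensions only, not taken from the paper
wide = tube_elements(length=0.12, radius=0.009)
narrow = tube_elements(length=0.04, radius=0.003)
print(wide)
print(narrow)
```

Because the resistance scales with the inverse fourth power of the radius, narrower airways contribute disproportionately to the total resistance, which is one way geometrical proportions can shape the resulting impedance.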

A Family of Affine Projection Adaptive Filtering Algorithms With Selective Regressors

In this paper we present a general formalism for the establishment of the family of selective regressor affine projection algorithms (SR-APA). The SR-APA, the SR regularized APA (SR-RAPA), the SR partial rank algorithm (SR-PRA), the SR binormalized data reusing least mean squares (SR-BNDR-LMS), and the SR normalized LMS with orthogonal correction factors (SR-NLMS-OCF) algorithms are established by this general formalism. We demonstrate the performance of the presented algorithms through simulations in an acoustic echo cancellation scenario.
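
For orientation, the sketch below implements the plain regularized affine projection update, w ← w + μ Xᵀ (X Xᵀ + δI)⁻¹ e, on a toy system identification problem. It uses all of the most recent regressors at every step and therefore does not implement the selective-regressor rule proposed in the paper; the function name, step size, projection order, and regularization value are illustrative assumptions.

```python
import numpy as np

def apa_identify(x, d, n_taps=8, order=4, mu=0.5, delta=1e-6):
    """Plain regularized affine projection algorithm (no regressor selection)."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    w = np.zeros(n_taps)
    for n in range(n_taps + order - 2, len(x)):
        # Rows: the `order` most recent regressor vectors, newest first;
        # each regressor is [x[m], x[m-1], ..., x[m-n_taps+1]]
        X = np.array([x[n - k - n_taps + 1: n - k + 1][::-1]
                      for k in range(order)])
        e = d[n - np.arange(order)] - X @ w          # a priori error vector
        w = w + mu * X.T @ np.linalg.solve(X @ X.T + delta * np.eye(order), e)
    return w

rng = np.random.default_rng(3)
echo_path = np.array([0.6, -0.3, 0.2, 0.1, -0.05, 0.03, 0.02, -0.01])
far_end = rng.standard_normal(2000)
echo = np.convolve(far_end, echo_path)[:len(far_end)]   # noise-free echo
print(np.round(apa_identify(far_end, echo), 3))
```

A selective-regressor variant would choose a subset of the rows of X at each step according to some criterion, reducing the size of the matrix that has to be inverted.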

Speech Enhancement by Marginal Statistical Characterization in the Log Gabor Wavelet Domain

This work presents a fusion of Log Gabor Wavelet (LGW) and Maximum a Posteriori (MAP) estimator as a speech enhancement tool for acoustical background noise reduction. The probability density function (pdf) of the speech spectral amplitude is approximated by a Generalized Laplacian Distribution (GLD). Compared to earlier estimators the proposed method estimates the underlying statistical model more accurately by appropriately choosing the model parameters of GLD. Experimental results show that the proposed estimator yields a higher improvement in Segmental Signal-to-Noise Ratio (S-SNR) and lower Log-Spectral Distortion (LSD) in two different noisy environments compared to other estimators.
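
The segmental SNR used as an evaluation measure can be sketched as the mean of per-frame SNR values in decibels. The frame length and the clipping of each frame's SNR to a fixed range are common conventions assumed here for illustration; the abstract does not state the exact settings, and the function name is hypothetical.

```python
import numpy as np

def segmental_snr(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0, eps=1e-10):
    """Mean per-frame SNR in dB, with each frame clipped to [lo, hi].

    frame_len, lo and hi are assumed conventions, not values from the paper.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    n_frames = len(clean) // frame_len
    snrs = []
    for i in range(n_frames):
        s = clean[i * frame_len:(i + 1) * frame_len]
        err = s - enhanced[i * frame_len:(i + 1) * frame_len]
        snr = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(err ** 2) + eps))
        snrs.append(np.clip(snr, lo, hi))
    return float(np.mean(snrs))

rng = np.random.default_rng(4)
speech = np.sin(2 * np.pi * 0.01 * np.arange(4096))     # stand-in for clean speech
processed = speech + 0.05 * rng.standard_normal(4096)   # stand-in for enhanced output
print(segmental_snr(speech, processed))
```

The improvement reported for an estimator would be the difference between this value for the enhanced signal and for the unprocessed noisy signal.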