Abstract: Mammals are known to use the Interaural Intensity Difference (IID) to determine the azimuthal position of high-frequency sounds. In the Lateral Superior Olive (LSO), neurons have firing behaviours that vary systematically with IID. These neurons receive excitatory inputs from the ipsilateral ear and inhibitory inputs from the contralateral one. The IID sensitivity of an LSO neuron is thought to be due to differences in delay between the two ears, which arise from differing synaptic delays and from intensity-dependent delays. In this paper we model the auditory pathway up to the LSO. Inputs to LSO neurons are initially numerous and differ in their relative delays. Spike Timing-Dependent Plasticity is then used to prune those connections. We compare the responses of the pruned neurons with physiological data and analyse the relationship between the IIDs of the teacher stimuli and the IID sensitivities of the trained LSO neurons.
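The pruning step can be illustrated with a minimal sketch of a generic pair-based STDP rule. This is not the model used in the paper: the function names, the amplitudes and time constants, the fixed postsynaptic spike time and the pruning threshold are all assumptions chosen for illustration.

```python
import math

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for dt = t_post - t_pre in ms (assumed pair-based rule)."""
    if dt > 0:   # input arrives before the output spike: potentiate
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:   # input arrives after the output spike: depress
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0

def train_and_prune(pre_delays, post_time, weights,
                    epochs=50, w_max=1.0, prune_below=0.05):
    """Apply the rule repeatedly, then keep only inputs whose weight survived.

    Assumption: each input spikes once per epoch at a time equal to its
    delay, and the neuron fires at a fixed post_time.
    """
    w = list(weights)
    for _ in range(epochs):
        for i, delay in enumerate(pre_delays):
            w[i] = min(w_max, max(0.0, w[i] + stdp_dw(post_time - delay)))
    return [i for i, wi in enumerate(w) if wi >= prune_below]

# Inputs with delays 1 and 4 ms precede the output spike at 5 ms and are kept;
# inputs with delays 9 and 12 ms follow it and are pruned.
survivors = train_and_prune([1.0, 4.0, 9.0, 12.0], 5.0, [0.2, 0.2, 0.2, 0.2])
```

In this toy setting, the surviving connections are those whose delays place the input spike shortly before the output spike, which is the sense in which the teacher stimulus shapes the delays that remain.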
Abstract: In this study, the use of a silicon NAM (Non-Audible
Murmur) microphone in automatic speech recognition is presented.
NAM microphones are special acoustic sensors, which are attached
behind the talker's ear and can capture not only normal (audible)
speech, but also very quietly uttered speech (non-audible murmur).
As a result, NAM microphones can be applied in automatic speech
recognition systems when privacy is desired in human-machine communication.
Moreover, NAM microphones show robustness against
noise and they might be used in special systems (speech recognition,
speech conversion etc.) for sound-impaired people. Using a small
amount of training data and adaptation approaches, 93.9% word
accuracy was achieved for a 20k Japanese vocabulary dictation
task. Non-audible murmur recognition in noisy environments is also
investigated. Further analysis of NAM speech has been carried
out using distance measures between hidden Markov model
(HMM) pairs. A metric distance shows that NAM speech occupies a
reduced spectral space; however, the locations of the NAM
phonemes are similar to those of the phonemes of normal
speech, and the NAM sounds are well discriminated.
Promising results using nonlinear features are also presented,
especially under noisy conditions.
Abstract: This article presents results obtained with a parametric approach and the Wavelet Transform in analysing signals emitted by the sperm whale. Extracting the intrinsic characteristics of these unique signals emitted by marine mammals remains a difficult exercise for two reasons: firstly, the signals are non-stationary, and secondly, they are obscured by interfering background noise. In this article, we compare the advantages and disadvantages of both methods: Auto Regressive models and the Wavelet Transform. These approaches serve as an alternative to the commonly used estimators based on the Fourier Transform, whose underlying hypotheses are, in certain cases, not sufficiently verified. These modern approaches provide effective results, particularly for the periodic tracking of the signal's characteristics and notably when the signal-to-noise ratio negatively affects signal tracking. Our objectives are twofold. Our first goal is to identify the animal through its acoustic signature. This includes recognition of the marine mammal species and ultimately of the individual animal (within the species). The second is much more ambitious and directly involves cetologists, who study the sounds emitted by marine mammals in an effort to characterize their behaviour. We are working on an approach based on recordings of marine mammal signals, and the findings from these data are obtained with the Wavelet Transform. This article explores the reasons for using this approach. In addition, thanks to new processors, these algorithms, once heavy in computation time, can be integrated into a real-time system.
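As a rough illustration of the parametric approach, the sketch below fits Auto Regressive coefficients to a signal with the Yule-Walker equations, solved by the Levinson-Durbin recursion. The abstract does not say which estimator the authors use, so the choice of method, the function names and the synthetic test signal are assumptions.

```python
import random

def autocorr(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of a mean-removed signal."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    return [sum(xc[i] * xc[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for x[n] ~ sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]                      # prediction error power at order 0
    for m in range(1, order + 1):
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        kappa = acc / err           # reflection coefficient
        new_a = a[:]
        new_a[m] = kappa
        for k in range(1, m):
            new_a[k] = a[k] - kappa * a[m - k]
        a = new_a
        err *= 1.0 - kappa * kappa
    return a[1:], err

def fit_ar(x, order):
    """Return (coefficients, noise variance) of an AR model of the given order."""
    return levinson_durbin(autocorr(x, order), order)

def synthetic_ar1(n, coeff, seed=0):
    """Hypothetical test signal: x[n] = coeff * x[n-1] + white Gaussian noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(coeff * x[-1] + rng.gauss(0.0, 1.0))
    return x

coeffs, noise_var = fit_ar(synthetic_ar1(20000, 0.8), 2)
```

For a non-stationary signal, such a fit would typically be repeated on short sliding windows so that the coefficients can be tracked over time.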
Abstract: Currently, the quality of motors is inspected by human
ear. In this paper, I propose two systems that use speech recognition
methods to automate the inspection. The first system is based
on linear processing, using K-means and the Nearest
Neighbor method, and the second is based on non-linear
processing, using neural networks. I applied these systems to motor
sounds and successfully recognized 86.67% of the motor sounds with the
linear processing system and 97.78% with the non-linear processing
system.
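A minimal sketch of how K-means and a nearest-neighbour rule can be combined for this kind of classification: K-means builds a small codebook of centroids per class, and a new feature vector is assigned to the class that owns the closest centroid. The abstract does not describe the features or the exact pipeline, so the two-dimensional toy features, the class labels and the function names are assumptions.

```python
def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Plain K-means with a deterministic start (the first k points)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
            groups[nearest].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

def build_codebooks(train, k):
    """One codebook of k centroids per class label."""
    return {label: kmeans(points, k) for label, points in train.items()}

def classify(x, codebooks):
    """Nearest-neighbour decision over all class centroids."""
    return min(codebooks,
               key=lambda label: min(dist2(x, c) for c in codebooks[label]))

# Hypothetical two-dimensional features for two classes of motor sound.
train = {
    "normal": [(0.0, 0.1), (0.2, 0.0), (1.0, 1.1), (1.2, 0.9)],
    "faulty": [(5.0, 5.1), (5.2, 4.9), (6.0, 6.2), (6.1, 5.9)],
}
codebooks = build_codebooks(train, k=2)
label = classify((0.9, 1.0), codebooks)
```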
Abstract: In this paper, we propose a new approach to query-by-humming, focusing on a database of MP3 songs. Since melody is much more difficult to represent for MP3 songs than for symbolic performance data, we extract feature descriptors from the vocal part of the songs. Our approach is based on signal filtering, sub-band spectral processing, MDCT coefficient analysis and peak energy detection, ignoring the background music as much as possible. Finally, we apply a dual dynamic programming algorithm for feature similarity matching. Experiments show its online performance in terms of precision and efficiency.
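The abstract does not specify the dual dynamic programming algorithm. As a stand-in, the sketch below uses ordinary dynamic time warping (DTW), a common dynamic programming technique for comparing a hummed query with stored melody features at different tempi. The pitch sequences, song names and function names are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def rank_songs(query, database):
    """Return song names ordered from best to worst match."""
    return sorted(database, key=lambda name: dtw_distance(query, database[name]))

# Hypothetical pitch contours (MIDI note numbers).
database = {"song_a": [60, 62, 64, 65], "song_b": [70, 70, 72, 75]}
query = [60, 62, 62, 64, 65]   # song_a hummed with one note held longer
ranking = rank_songs(query, database)
```

Because warping lets one note align with several frames, a query that only differs in timing from a stored melody receives a distance of zero.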
Abstract: Artificial Neural Networks (ANNs) have been
extensively used for classification of heart sounds because of their
discriminative training ability and easy implementation. However, an ANN
suffers from overparameterization if the number of nodes is not
chosen properly. In such cases, when the dataset contains redundancy,
the ANN is trained on this redundant information as well, which
results in poor validation performance. Also, a larger network means more
computational expense, resulting in higher hardware- and time-related
cost. Therefore, an optimum neural network design is needed
for real-time detection of pathological patterns, if any, in heart
sound signals. The aims of this work are to (i) select a set of input
features that are effective for identification of heart sound signals and
(ii) ensure an optimum selection of nodes in the hidden layer for a
more effective ANN structure. Here, we present an optimization
technique that uses Singular Value Decomposition (SVD) and
QR factorization with column pivoting (QRcp) to
optimize an empirically chosen, over-parameterized ANN structure.
The input nodes of the ANN structure are optimized by SVD followed
by QRcp, while only SVD is required to prune undesirable hidden
nodes. Results are presented for classifying 12 common
pathological cases and normal heart sounds.
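One common way to combine SVD and QRcp for input selection is sketched below: the singular values give the numerical rank of the feature matrix, and column pivoting on the leading right singular vectors picks that many non-redundant input columns. Whether the authors use exactly this procedure is not stated in the abstract. The function names, the tolerance and the synthetic data are assumptions, and the pivoting is written as a greedy Gram-Schmidt loop rather than a call to a library QR routine.

```python
import numpy as np

def pivot_order(A, k):
    """Greedy column pivoting (Gram-Schmidt form of QRcp): k column indices."""
    R = np.array(A, dtype=float)
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))          # column with the largest residual
        chosen.append(j)
        q = R[:, j] / norms[j]
        R -= np.outer(q, q @ R)            # remove that direction everywhere
    return chosen

def select_inputs(X, rel_tol=1e-8):
    """Pick a non-redundant subset of the columns (input features) of X."""
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    rank = int(np.sum(s > rel_tol * s[0]))  # numerical rank from singular values
    return sorted(pivot_order(vt[:rank, :], rank))

# Synthetic example: five candidate inputs, two of which are redundant.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 3))
X = np.column_stack([base[:, 0], base[:, 1], base[:, 0] + base[:, 1],
                     base[:, 2], 2.0 * base[:, 2]])
selected = select_inputs(X)
```

The same rank estimate, applied to the matrix of hidden-layer outputs, indicates how many hidden nodes carry independent information.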