Abstract: Vector quantization is a powerful tool for speech
coding applications. This paper deals with LPC coding of speech
signals using a new technique called Multi Switched Split
Vector Quantization (MSSVQ), a hybrid of the multistage, switched,
and split vector quantization techniques. The spectral distortion
performance, computational complexity, and memory requirements
of MSSVQ are compared with those of split vector quantization (SVQ),
multistage vector quantization (MSVQ), and switched split vector
quantization (SSVQ). The results show that
MSSVQ has better spectral distortion performance, lower
computational complexity, and lower memory requirements than
all of the above product code vector
quantization techniques. Computational complexity is measured in
floating point operations (flops), and memory requirements are
measured in floats.
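To make the split vector quantization baseline named in the abstract above concrete, here is a minimal sketch in Python. It is not the authors' MSSVQ implementation. The toy codebooks, the function names, and the 3-3-4 split at 8 bits per part are illustrative assumptions. The memory count simply adds up the number of floats stored in each part's codebook.

```python
def nearest_index(codebook, vector):
    """Return the index of the codevector closest to `vector` (squared error)."""
    def sq_err(code):
        return sum((c - v) ** 2 for c, v in zip(code, vector))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))


def svq_encode(vector, codebooks):
    """Split `vector` into consecutive parts, one per codebook, and quantize each part."""
    indices, start = [], 0
    for codebook in codebooks:
        dim = len(codebook[0])
        indices.append(nearest_index(codebook, vector[start:start + dim]))
        start += dim
    return indices


def svq_decode(indices, codebooks):
    """Concatenate the selected codevectors back into one vector."""
    out = []
    for idx, codebook in zip(indices, codebooks):
        out.extend(codebook[idx])
    return out


def svq_memory_floats(bits, dims):
    """Codebook storage in floats: sum over parts of 2**bits * part dimension."""
    return sum((2 ** b) * d for b, d in zip(bits, dims))


# Toy codebooks (hypothetical values): a 2-D part followed by a 1-D part.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0], [0.5], [1.0]],
]
indices = svq_encode([0.9, 0.8, 0.4], codebooks)
print(indices)                                   # [1, 1]
print(svq_decode(indices, codebooks))            # [1.0, 1.0, 0.5]
print(svq_memory_floats([8, 8, 8], [3, 3, 4]))   # 2560
```

Under these assumptions, a 10-dimensional vector split 3-3-4 at 24 bits/frame needs 2560 floats of codebook storage. A single unsplit 24-bit codebook would need 2^24 x 10 floats, which is why product code structures are attractive.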
Abstract: We analyze the effectiveness of different pseudo noise (PN) and orthogonal sequences for encrypting speech signals in terms of perceptual intelligibility. A speech signal can be viewed as a sequence of correlated samples, and each sample as a sequence of bits. The residual intelligibility of the speech signal can be reduced by removing the correlation among the speech samples. PN sequences have random-like properties that help reduce the correlation among speech samples. The mean square aperiodic auto-correlation (MSAAC) and the mean square aperiodic cross-correlation (MSACC) measures are used to test the randomness of the PN sequences. The results of the investigation show the effectiveness of large Kasami sequences for this purpose among the many PN sequences considered.
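The MSAAC measure mentioned above can be sketched as follows. The abstract does not give its formula, so this sketch assumes a commonly used definition. The aperiodic autocorrelation at lag tau is normalized by the sequence length N, and MSAAC is the sum of its squares over all non-zero lags. The Barker sequence used as input is only a convenient test case, not one of the sequences studied in the paper.

```python
def aperiodic_autocorr(seq, lag):
    """Aperiodic autocorrelation of a real sequence at `lag`, normalized by its length."""
    n = len(seq)
    lag = abs(lag)  # for real sequences the function is symmetric in the lag
    return sum(seq[i] * seq[i + lag] for i in range(n - lag)) / n


def msaac(seq):
    """Mean square aperiodic autocorrelation: sum of squared values over non-zero lags."""
    n = len(seq)
    return sum(aperiodic_autocorr(seq, lag) ** 2
               for lag in range(1 - n, n) if lag != 0)


barker7 = [1, 1, 1, -1, -1, 1, -1]   # low sidelobes, so a small MSAAC
all_ones = [1] * 7                   # highly correlated, so a large MSAAC
print(msaac(barker7))
print(msaac(all_ones))
```

A lower MSAAC means smaller autocorrelation sidelobes, that is, a more noise-like sequence. For the Barker sequence the value works out to 6/49.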
Abstract: This paper considers the problem of cryptanalysis of stream ciphers. Existing attacks on stream ciphers need to be improved, and an attempt is made to distinguish the portions of cipher text obtained by encrypting plain text in which some parts are random and the rest are non-random. This paper presents a tutorial introduction to symmetric cryptography. The basic information-theoretic and computational properties of classic and modern cryptographic systems are presented, followed by an examination of the application of cryptography to the security of VoIP systems in computer networks using the LFSR algorithm. The implementation program will be developed in Java 2. The LFSR algorithm is appropriate for the encryption and decryption of online streaming data, e.g. VoIP (voice chatting over IP). This paper implements an encryption module that converts speech signals to cipher text and a decryption module that converts cipher text back to speech signals.
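The LFSR-based encryption and decryption modules described above can be illustrated with a small sketch. The paper's program is stated to be in Java 2. This Python version is an independent illustration, and the 16-bit register, the tap positions (16, 14, 13, 11) and the seed value are assumptions, not details taken from the paper. A single LFSR is linear and easy to break, so this shows the mechanism only.

```python
def lfsr_keystream(seed, nbytes):
    """Generate `nbytes` of keystream from a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the register stays at zero")
    out = bytearray()
    for _byte in range(nbytes):
        value = 0
        for _bit in range(8):
            value = (value << 1) | (state & 1)                      # output the low bit
            feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (feedback << 15)                 # shift in the feedback
        out.append(value)
    return bytes(out)


def lfsr_xor(data, seed):
    """XOR `data` with the keystream; the same call encrypts and decrypts."""
    keystream = lfsr_keystream(seed, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


samples = bytes([12, 200, 33, 47, 0, 255])   # stand-in for a block of speech bytes
cipher = lfsr_xor(samples, seed=0xACE1)
plain = lfsr_xor(cipher, seed=0xACE1)
print(cipher.hex(), plain == samples)
```

Because encryption is a plain XOR with the keystream, applying the same function with the same seed recovers the original bytes. This symmetry is what makes the scheme convenient for streaming data.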
Abstract: We present a novel scheme to recognize isolated speech
signals using certain statistical parameters derived from those signals.
The determination of the statistical estimates is based on extracted
signal information rather than the original signal information in
order to reduce the computational complexity. Subtle details of
these estimates, after extracting the speech signal from ambient
noise, are first exploited to segregate the polysyllabic words from
the monosyllabic ones. Precise recognition of each distinct word is
then carried out by analyzing the histogram obtained from this
information.
Abstract: In real-field applications, the correct determination of voice segments greatly improves the overall system accuracy and minimises the total computation time. This paper presents reliable measures of speech compression by detecting the end points of the speech signals prior to compressing them. The two compression schemes used are the Global threshold and the Level-Dependent threshold techniques. The performance of the proposed method is tested with the Signal to Noise Ratio, Peak Signal to Noise Ratio and Normalized Root Mean Square Error measures.
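The three quality measures named above can be computed as in the sketch below. The abstract does not state the exact formulas, so these are commonly used definitions and should be read as assumptions. SNR is signal energy over error energy. PSNR uses the peak absolute value of the original signal. NRMSE normalizes the error by the spread of the original around its mean.

```python
import math


def snr_db(original, reconstructed):
    """Signal to noise ratio in dB: signal energy over error energy."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    return 10 * math.log10(signal / noise)


def psnr_db(original, reconstructed):
    """Peak SNR in dB: squared peak of the original over the mean squared error."""
    mse = sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)
    peak = max(abs(x) for x in original)
    return 10 * math.log10(peak * peak / mse)


def nrmse(original, reconstructed):
    """Root of the error energy normalized by the energy of the mean-removed original."""
    mean = sum(original) / len(original)
    num = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    den = sum((x - mean) ** 2 for x in original)
    return math.sqrt(num / den)


x = [1.0, -1.0, 1.0, -1.0]        # toy "original" samples
y = [0.9, -0.9, 0.9, -0.9]        # toy "decompressed" samples
print(snr_db(x, y), psnr_db(x, y), nrmse(x, y))
```

For this toy pair the error is 10% of the amplitude, so both SNR and PSNR come out near 20 dB and the NRMSE near 0.1.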
Abstract: This paper presents a formant-tracking linear prediction
(FTLP) model for speech processing in noise. The main focus of this
work is the detection of formant trajectory based on Hidden Markov
Models (HMM), for improved formant estimation in noise. The
approach proposed in this paper provides a systematic framework for
modelling and utilization of a time sequence of peaks which satisfies
continuity constraints on the parameters; the within peaks are modelled
by the LP parameters. The formant-tracking LP model estimation
is composed of three stages: (1) a pre-cleaning multi-band spectral
subtraction stage to reduce the effect of residual noise on formants;
(2) an estimation stage where an initial estimate of the LP model of
speech for each frame is obtained; and (3) a formant classification stage using
probability models of formants and Viterbi decoders. The evaluation
results for the estimation of the formant-tracking LP model, tested
in a Gaussian white noise background, demonstrate that the proposed
combination of the initial noise reduction stage with formant tracking
and LPC variable-order analysis results in a significant reduction in
errors and distortions. The performance was evaluated with noisy
natural vowels extracted from international French and English vocabulary
speech signals at an SNR value of 10 dB. In each case, the
estimated formants are compared to reference formants.
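The second stage above needs an initial LP model for each frame. The abstract does not say how that estimate is computed. One standard choice, assumed here purely for illustration, is the autocorrelation method solved with the Levinson-Durbin recursion. The function names are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations; returns (a, err) with A(z) = 1 + a[1] z^-1 + ..."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= (1.0 - k * k)   # prediction error energy shrinks at each order
    return a, err


# Autocorrelation of an ideal first-order process with pole at 0.5 (r[k] = 0.5**k).
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, residual)   # [1.0, -0.5, 0.0] 0.75
```

In a formant-tracking setting, the roots of the resulting polynomial A(z) would give candidate formant frequencies for the frame. That step is left out of this sketch.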
Abstract: In this paper, in consideration of the deficiencies of each
available technique for speech recognition, an advanced method
is presented that is able to classify speech signals with high
accuracy (98%) in minimum time. In the presented method, the
recorded signal is first preprocessed; this stage includes
denoising with Mel Frequency Cepstral Analysis and feature
extraction using discrete wavelet transform (DWT) coefficients. Then
these features are fed to a Multilayer Perceptron (MLP) network for
classification. Finally, after training of the neural network, effective
features are selected with the UTA algorithm.
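As an illustration of DWT-based feature extraction like that described above, the sketch below computes one energy value per wavelet subband. The abstract names neither the wavelet nor the way coefficients become features. The Haar wavelet, the subband-energy features and the function names are therefore all assumptions. The sketch also assumes the signal length is divisible by 2 raised to the number of levels.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail) coefficients."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def dwt_energy_features(signal, levels):
    """Energy of each detail subband (finest first), then of the final approximation."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


frame = [4.0, 2.0, 5.0, 5.0]
print(dwt_energy_features(frame, levels=1))   # approximately [2.0, 68.0]
```

Because the Haar transform is orthonormal, the feature values sum to the energy of the input frame. A vector of this kind could then be fed to a classifier such as an MLP.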
Abstract: Preprocessing of speech signals is considered a crucial step in the development of a robust and efficient speech or speaker recognition system. In this paper, we present some popular statistical outlier-detection based strategies to segregate the silence/unvoiced part of the speech signal from the voiced portion. The proposed methods are based on the 3σ edit rule and the Hampel identifier, which are compared with conventional techniques: (i) short-time energy (STE) based methods, and (ii) distribution based methods. The results obtained after applying the proposed strategies to some test voice signals are encouraging.
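The two outlier rules named above can be sketched in a few lines. The abstract does not say which signal feature the rules are applied to, so the sketch works on a generic list of numbers (for instance, per-frame values). The threshold of 3 and the factor 1.4826, which makes the median absolute deviation comparable to a standard deviation for Gaussian data, are the usual textbook choices rather than values taken from the paper.

```python
import statistics


def three_sigma_outliers(data, t=3.0):
    """Indices whose distance from the mean exceeds t sample standard deviations."""
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)
    return [i for i, x in enumerate(data) if abs(x - mu) > t * sigma]


def hampel_outliers(data, t=3.0):
    """Indices whose distance from the median exceeds t scaled MADs."""
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    scale = 1.4826 * mad
    return [i for i, x in enumerate(data) if abs(x - med) > t * scale]


values = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9, 50.0]
print(three_sigma_outliers(values))   # []
print(hampel_outliers(values))        # [9]
```

The example shows a known weakness of the 3σ rule. The single large value inflates both the mean and the standard deviation and so masks itself. The median-based Hampel identifier still flags it.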
Abstract: This paper investigates the performance of a speech
recognizer in an interactive voice response system for various coded
speech signals, coded using a vector quantization technique, namely the
Multi Switched Split Vector Quantization Technique. The process of
recognizing the coded output can be used in voice banking applications.
The recognition technique used for the recognition of the coded speech
signals is the Hidden Markov Model technique. The spectral distortion
performance, computational complexity, and memory requirements of
Multi Switched Split Vector Quantization Technique and the
performance of the speech recognizer at various bit rates have been
computed. The results show that the speech recognizer
performs better at 24 bits/frame, and that the
recognition rate varies from 100% to 93.33% across the
various bit rates.
Abstract: This work presents a novel means of extracting fixed-length parameters from voice signals, such that words can be recognized
in linear time. The power and the zero crossing rate are first
calculated segment by segment from a voice signal; by doing so, two
feature sequences are generated. We then construct an FIR system
across these two sequences. The parameters of this FIR system, used
as the input of a multilayer perceptron recognizer, can be derived by
recursive LSE (least-squares estimation), implying that the complexity of the overall process is linear in the signal size. In the second part of
this work, we introduce a weighting factor λ to emphasize recent
input; therefore, we can further recognize continuous speech signals.
Experiments employ the voice signals of numbers, from zero to nine, spoken in Mandarin Chinese. The proposed method is verified to
recognize voice signals efficiently and accurately.
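The first step described above, computing the power and the zero crossing rate segment by segment, can be sketched as follows. The segment length, the use of non-overlapping segments and the normalization of the zero crossing rate by the number of adjacent sample pairs are assumptions made for illustration. The FIR fitting by recursive LSE is not shown.

```python
def segment_features(signal, seg_len):
    """Return (powers, zcrs): mean power and zero crossing rate of each full segment."""
    if seg_len < 2:
        raise ValueError("seg_len must be at least 2")
    powers, zcrs = [], []
    for start in range(0, len(signal) - seg_len + 1, seg_len):
        seg = signal[start:start + seg_len]
        powers.append(sum(x * x for x in seg) / seg_len)
        # A crossing is counted whenever two neighbouring samples differ in sign.
        crossings = sum(1 for a, b in zip(seg, seg[1:]) if (a >= 0) != (b >= 0))
        zcrs.append(crossings / (seg_len - 1))
    return powers, zcrs


voice = [1, -1, 1, -1, 2, 2, 2, 2]
powers, zcrs = segment_features(voice, seg_len=4)
print(powers)   # [1.0, 4.0]
print(zcrs)     # [1.0, 0.0]
```

Each segment is visited once, so producing the two feature sequences takes time linear in the signal length. This is consistent with the linear-time claim in the abstract.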
Abstract: Independent component analysis (ICA) in the
frequency domain is used for solving the problem of blind source
separation (BSS). However, this method has some problems. For
example, a general ICA algorithm cannot determine the permutation
of signals, which is important in frequency-domain ICA. In this
paper, we propose an approach to solving the permutation
problem. The idea is to effectively combine two conventional
approaches. This approach improves the signal separation
performance by exploiting features of the conventional approaches.
We show the simulation results using artificial data.
Abstract: In this work, we are interested in developing a speech denoising tool using the discrete wavelet packet transform (DWPT). This speech denoising tool will be employed in recognition, coding and synthesis applications. For noise reduction, instead of applying the classical thresholding technique, some wavelet packet nodes are set to zero and the others are thresholded. To estimate the non-stationary noise level, we employ the spectral entropy. Our proposed technique is compared with classical denoising methods based on thresholding and spectral subtraction in order to evaluate our approach. The experimental implementation uses speech signals corrupted by two sorts of noise, white noise and Volvo noise. The results obtained from listening tests show that our proposed technique is better than spectral subtraction. The results obtained from SNR computation show the superiority of our technique over the classical thresholding method using the modified hard thresholding function based on the μ-law algorithm.
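For reference, the classical thresholding that the abstract above uses as a baseline is usually taken to mean hard or soft thresholding of the transform coefficients. The sketch below shows those two textbook rules only. It does not reproduce the authors' node-zeroing scheme or the μ-law based modified hard threshold, whose formulas are not given in the abstract. The coefficient values and the threshold are arbitrary.

```python
import math


def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; set the rest to zero."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def soft_threshold(coeffs, t):
    """Zero small coefficients and shrink the remaining ones toward zero by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


node = [0.2, -1.5, 3.0]               # hypothetical wavelet packet coefficients
print(hard_threshold(node, 1.0))      # [0.0, -1.5, 3.0]
print(soft_threshold(node, 1.0))      # [0.0, -0.5, 2.0]
```

Hard thresholding preserves the amplitude of retained coefficients but introduces a discontinuity at the threshold. Soft thresholding is continuous but biases large coefficients. Modified thresholding functions generally aim for a compromise between the two.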
Abstract: In this paper, we propose a novel time-frequency distribution (TFD) for the analysis of multi-component signals. In particular, we use synthetic as well as real-life speech signals to demonstrate the superiority of the proposed TFD in comparison to some existing ones. In the comparison, we consider cross-term suppression and the energy concentration of the signal around its instantaneous frequency (IF).