Abstract: In this paper, a new encoding algorithm of spectral envelope based on NLMS in G.729.1 for VoIP is proposed. In the TDAC part of G.729.1, the spectral envelope and MDCT coefficients extracted in the weighted CELP coding error (lower-band) and the higher-band input signal are encoded. In order to reduce allocation bits for spectral envelope coding, a new quantization algorithm based on NLMS is proposed. Also, reduced bits are used to enhance sound quality. The performance of the proposed algorithm is evaluated by sound quality and bit reduction rates in clean and frame loss conditions.
Abstract: Vector quantization is a powerful tool for speech
coding applications. This paper deals with LPC Coding of speech
signals which uses a new technique called Multi Switched Split
Vector Quantization (MSSVQ), which is a hybrid of Multi, switched,
split vector quantization techniques. The spectral distortion
performance, computational complexity, and memory requirements
of MSSVQ are compared to split vector quantization (SVQ), multi
stage vector quantization(MSVQ) and switched split vector
quantization (SSVQ) techniques. It has been proved from results that
MSSVQ has better spectral distortion performance, lower
computational complexity and lower memory requirements when
compared to all the above mentioned product code vector
quantization techniques. Computational complexity is measured in
floating point operations (flops), and memory requirements is
measured in (floats).
Abstract: While compressing text files is useful, compressing
still image files is almost a necessity. A typical image takes up much
more storage than a typical text message and without compression
images would be extremely clumsy to store and distribute. The
amount of information required to store pictures on modern
computers is quite large in relation to the amount of bandwidth
commonly available to transmit them over the Internet and
applications. Image compression addresses the problem of reducing
the amount of data required to represent a digital image. Performance
of any image compression method can be evaluated by measuring the
root-mean-square-error & peak signal to noise ratio. The method of
image compression that will be analyzed in this paper is based on the
lossy JPEG image compression technique, the most popular
compression technique for color images. JPEG compression is able to
greatly reduce file size with minimal image degradation by throwing
away the least “important" information. In JPEG, both color
components are downsampled simultaneously, but in this paper we
will compare the results when the compression is done by
downsampling the single chroma part. In this paper we will
demonstrate more compression ratio is achieved when the
chrominance blue is downsampled as compared to downsampling the
chrominance red in JPEG compression. But the peak signal to noise
ratio is more when the chrominance red is downsampled as compared
to downsampling the chrominance blue in JPEG compression. In
particular we will use the hats.jpg as a demonstration of JPEG
compression using low pass filter and demonstrate that the image is
compressed with barely any visual differences with both methods.
Abstract: In this paper, we propose novel algorithmic models
based on information fusion and feature transformation in crossmodal
subspace for different types of residue features extracted from
several intra-frame and inter-frame pixel sub-blocks in video
sequences for detecting digital video tampering or forgery. An
evaluation of proposed residue features – the noise residue features
and the quantization features, their transformation in cross-modal
subspace, and their multimodal fusion, for emulated copy-move
tamper scenario shows a significant improvement in tamper detection
accuracy as compared to single mode features without transformation
in cross-modal subspace.
Abstract: This paper presents a very simple and efficient
algorithm for codebook search, which reduces a great deal of
computation as compared to the full codebook search. The algorithm
is based on sorting and centroid technique for search. The results
table shows the effectiveness of the proposed algorithm in terms of
computational complexity. In this paper we also introduce a new
performance parameter named as Average fractional change in pixel
value as we feel that it gives better understanding of the closeness of
the image since it is related to the perception. This new performance
parameter takes into consideration the average fractional change in
each pixel value.
Abstract: In this paper, a watermarking algorithm that uses the wavelet transform with Multiple Description Coding (MDC) and Quantization Index Modulation (QIM) concepts is introduced. Also, the paper investigates the role of Contourlet Transform (CT) versus Wavelet Transform (WT) in providing robust image watermarking. Two measures are utilized in the comparison between the waveletbased and the contourlet-based methods; Peak Signal to Noise Ratio (PSNR) and Normalized Cross-Correlation (NCC). Experimental results reveal that the introduced algorithm is robust against different attacks and has good results compared to the contourlet-based algorithm.
Abstract: Color Image quantization (CQ) is an important
problem in computer graphics, image and processing. The aim of
quantization is to reduce colors in an image with minimum distortion.
Clustering is a widely used technique for color quantization; all
colors in an image are grouped to small clusters. In this paper, we
proposed a new hybrid approach for color quantization using firefly
algorithm (FA) and K-means algorithm. Firefly algorithm is a swarmbased
algorithm that can be used for solving optimization problems.
The proposed method can overcome the drawbacks of both
algorithms such as the local optima converge problem in K-means
and the early converge of firefly algorithm. Experiments on three
commonly used images and the comparison results shows that the
proposed algorithm surpasses both the base-line technique k-means
clustering and original firefly algorithm.
Abstract: Wavelet transforms is a very powerful tools for image compression. One of its advantage is the provision of both spatial and frequency localization of image energy. However, wavelet transform coefficients are defined by both a magnitude and sign. While algorithms exist for efficiently coding the magnitude of the transform coefficients, they are not efficient for the coding of their sign. It is generally assumed that there is no compression gain to be obtained from the coding of the sign. Only recently have some authors begun to investigate the sign of wavelet coefficients in image coding. Some authors have assumed that the sign information bit of wavelet coefficients may be encoded with the estimated probability of 0.5; the same assumption concerns the refinement information bit. In this paper, we propose a new method for Separate Sign Coding (SSC) of wavelet image coefficients. The sign and the magnitude of wavelet image coefficients are examined to obtain their online probabilities. We use the scalar quantization in which the information of the wavelet coefficient to belong to the lower or to the upper sub-interval in the uncertainly interval is also examined. We show that the sign information and the refinement information may be encoded by the probability of approximately 0.5 only after about five bit planes. Two maps are separately entropy encoded: the sign map and the magnitude map. The refinement information of the wavelet coefficient to belong to the lower or to the upper sub-interval in the uncertainly interval is also entropy encoded. An algorithm is developed and simulations are performed on three standard images in grey scale: Lena, Barbara and Cameraman. Five scales are performed using the biorthogonal wavelet transform 9/7 filter bank. The obtained results are compared to JPEG2000 standard in terms of peak signal to noise ration (PSNR) for the three images and in terms of subjective quality (visual quality). It is shown that the proposed method outperforms the JPEG2000. The proposed method is also compared to other codec in the literature. It is shown that the proposed method is very successful and shows its performance in term of PSNR.
Abstract: The SOM has several beneficial features which make
it a useful method for data mining. One of the most important
features is the ability to preserve the topology in the projection.
There are several measures that can be used to quantify the goodness
of the map in order to obtain the optimal projection, including the
average quantization error and many topological errors. Many
researches have studied how the topology preservation should be
measured. One option consists of using the topographic error which
considers the ratio of data vectors for which the first and second best
BMUs are not adjacent. In this work we present a study of the
behaviour of the topographic error in different kinds of maps. We
have found that this error devaluates the rectangular maps and we
have studied the reasons why this happens. Finally, we suggest a new
topological error to improve the deficiency of the topographic error.
Abstract: In this paper, we present an improved fast and robust
search algorithm for copy detection using histogram-based features for
short MPEG video clips from large video database. There are two
types of histogram features used to generate more robust features. The
first one is based on the adjacent pixel intensity difference quantization
(APIDQ) algorithm, which had been reliably applied to human face
recognition previously. An APIDQ histogram is utilized as the feature
vector of the frame image. Another one is ordinal histogram feature
which is robust to color distortion. Furthermore, by Combining with a
temporal division method, the spatial and temporal features of the
video sequence are integrated to realize fast and robust video search
for copy detection. Experimental results show the proposed algorithm
can detect the similar video clip more accurately and robust than
conventional fast video search algorithm.
Abstract: In this paper, we propose an improved fast search
algorithm using combined histogram features and temporal division
method for short MPEG video clips from large video database. There
are two types of histogram features used to generate more robust
features. The first one is based on the adjacent pixel intensity
difference quantization (APIDQ) algorithm, which had been reliably
applied to human face recognition previously. An APIDQ histogram is
utilized as the feature vector of the frame image. Another one is
ordinal feature which is robust to color distortion. Combined with
active search [4], a temporal pruning algorithm, fast and robust video
search can be realized. The proposed search algorithm has been
evaluated by 6 hours of video to search for given 200 MPEG video
clips which each length is 30 seconds. Experimental results show the
proposed algorithm can detect the similar video clip in merely 120ms,
and Equal Error Rate (ERR) of 1% is achieved, which is more
accurately and robust than conventional fast video search algorithm.
Abstract: Nowadays, with the emerging of the new applications
like robot control in image processing, artificial vision for visual
servoing is a rapidly growing discipline and Human-machine
interaction plays a significant role for controlling the robot. This
paper presents a new algorithm based on spatio-temporal volumes for
visual servoing aims to control robots. In this algorithm, after
applying necessary pre-processing on video frames, a spatio-temporal
volume is constructed for each gesture and feature vector is extracted.
These volumes are then analyzed for matching in two consecutive
stages. For hand gesture recognition and classification we tested
different classifiers including k-Nearest neighbor, learning vector
quantization and back propagation neural networks. We tested the
proposed algorithm with the collected data set and results showed the
correct gesture recognition rate of 99.58 percent. We also tested the
algorithm with noisy images and algorithm showed the correct
recognition rate of 97.92 percent in noisy images.
Abstract: In this paper, a new algorithm for generating codebook is proposed for vector quantization (VQ) in image coding. The significant features of the training image vectors are extracted by using the proposed Orthogonal Polynomials based transformation. We propose to generate the codebook by partitioning these feature vectors into a binary tree. Each feature vector at a non-terminal node of the binary tree is directed to one of the two descendants by comparing a single feature associated with that node to a threshold. The binary tree codebook is used for encoding and decoding the feature vectors. In the decoding process the feature vectors are subjected to inverse transformation with the help of basis functions of the proposed Orthogonal Polynomials based transformation to get back the approximated input image training vectors. The results of the proposed coding are compared with the VQ using Discrete Cosine Transform (DCT) and Pairwise Nearest Neighbor (PNN) algorithm. The new algorithm results in a considerable reduction in computation time and provides better reconstructed picture quality.
Abstract: The scientific community has invested a great deal of effort in the fields of discrete wavelet transform in the last few decades. Discrete wavelet transform (DWT) associated with the vector quantization has been proved to be a very useful tool for the compression of image. However, the DWT is very computationally intensive process requiring innovative and computationally efficient method to obtain the image compression. The concurrent transformation of the image can be an important solution to this problem. This paper proposes a model of concurrent DWT for image compression. Additionally, the formal verification of the model has also been performed. Here the Symbolic Model Verifier (SMV) has been used as the formal verification tool. The system has been modeled in SMV and some properties have been verified formally.
Abstract: This paper presents a vocoder to obtain high quality synthetic speech at 600 bps. To reduce the bit rate, the algorithm is based on a sinusoidally excited linear prediction model which extracts few coding parameters, and three consecutive frames are grouped into a superframe and jointly vector quantization is used to obtain high coding efficiency. The inter-frame redundancy is exploited with distinct quantization schemes for different unvoiced/voiced frame combinations in the superframe. Experimental results show that the quality of the proposed coder is better than that of 2.4kbps LPC10e and achieves approximately the same as that of 2.4kbps MELP and with high robustness.
Abstract: The goal of this project is to design a system to
recognition voice commands. Most of voice recognition systems
contain two main modules as follow “feature extraction" and “feature
matching". In this project, MFCC algorithm is used to simulate
feature extraction module. Using this algorithm, the cepstral
coefficients are calculated on mel frequency scale. VQ (vector
quantization) method will be used for reduction of amount of data to
decrease computation time. In the feature matching stage Euclidean
distance is applied as similarity criterion. Because of high accuracy
of used algorithms, the accuracy of this voice command system is
high. Using these algorithms, by at least 5 times repetition for each
command, in a single training session, and then twice in each testing
session zero error rate in recognition of commands is achieved.
Abstract: This paper investigates the performance of a speech
recognizer in an interactive voice response system for various coded
speech signals, coded by using a vector quantization technique namely
Multi Switched Split Vector Quantization Technique. The process of
recognizing the coded output can be used in Voice banking application.
The recognition technique used for the recognition of the coded speech
signals is the Hidden Markov Model technique. The spectral distortion
performance, computational complexity, and memory requirements of
Multi Switched Split Vector Quantization Technique and the
performance of the speech recognizer at various bit rates have been
computed. From results it is found that the speech recognizer is
showing better performance at 24 bits/frame and it is found that the
percentage of recognition is being varied from 100% to 93.33% for
various bit rates.
Abstract: This paper describes a new supervised fusion (hybrid)
electrocardiogram (ECG) classification solution consisting of a new
QRS complex geometrical feature extraction as well as a new version
of the learning vector quantization (LVQ) classification algorithm
aimed for overcoming the stability-plasticity dilemma. Toward this
objective, after detection and delineation of the major events of ECG
signal via an appropriate algorithm, each QRS region and also its
corresponding discrete wavelet transform (DWT) are supposed as
virtual images and each of them is divided into eight polar sectors.
Then, the curve length of each excerpted segment is calculated
and is used as the element of the feature space. To increase the
robustness of the proposed classification algorithm versus noise,
artifacts and arrhythmic outliers, a fusion structure consisting of
five different classifiers namely as Support Vector Machine (SVM),
Modified Learning Vector Quantization (MLVQ) and three Multi
Layer Perceptron-Back Propagation (MLP–BP) neural networks with
different topologies were designed and implemented. The new proposed
algorithm was applied to all 48 MIT–BIH Arrhythmia Database
records (within–record analysis) and the discrimination power of the
classifier in isolation of different beat types of each record was
assessed and as the result, the average accuracy value Acc=98.51%
was obtained. Also, the proposed method was applied to 6 number
of arrhythmias (Normal, LBBB, RBBB, PVC, APB, PB) belonging
to 20 different records of the aforementioned database (between–
record analysis) and the average value of Acc=95.6% was achieved.
To evaluate performance quality of the new proposed hybrid learning
machine, the obtained results were compared with similar peer–
reviewed studies in this area.
Abstract: The paper presents a complete discrete statistical framework, based on a novel vector quantization (VQ) front-end process. This new VQ approach performs an optimal distribution of VQ codebook components on HMM states. This technique that we named the distributed vector quantization (DVQ) of hidden Markov models, succeeds in unifying acoustic micro-structure and phonetic macro-structure, when the estimation of HMM parameters is performed. The DVQ technique is implemented through two variants. The first variant uses the K-means algorithm (K-means- DVQ) to optimize the VQ, while the second variant exploits the benefits of the classification behavior of neural networks (NN-DVQ) for the same purpose. The proposed variants are compared with the HMM-based baseline system by experiments of specific Arabic consonants recognition. The results show that the distributed vector quantization technique increase the performance of the discrete HMM system.
Abstract: In this paper, we present local image descriptor using
VQ-SIFT for more effective and efficient image retrieval. Instead of
SIFT's weighted orientation histograms, we apply vector quantization
(VQ) histogram as an alternate representation for SIFT features.
Experimental results show that SIFT features using VQ-based local
descriptors can achieve better image retrieval accuracy than the
conventional algorithm while the computational cost is significantly
reduced.