Adaptive Motion Estimator Based on Variable Block Size Scheme

This paper presents an adaptive motion estimator that can be dynamically reconfigured by the best algorithm depending on the variation of the video nature during the lifetime of an application under running. The 4 Step Search (4SS) and the Gradient Search (GS) algorithms are integrated in the estimator in order to be used in the case of rapid and slow video sequences respectively. The Full Search Block Matching (FSBM) algorithm has been also integrated in order to be used in the case of the video sequences which are not real time oriented. In order to efficiently reduce the computational cost while achieving better visual quality with low cost power, the proposed motion estimator is based on a Variable Block Size (VBS) scheme that uses only the 16x16, 16x8, 8x16 and 8x8 modes. Experimental results show that the adaptive motion estimator allows better results in term of Peak Signal to Noise Ratio (PSNR), computational cost, FPGA occupied area, and dissipated power relatively to the most popular variable block size schemes presented in the literature.

Low Power and Less Area Architecture for Integer Motion Estimation

Full search block matching algorithm is widely used for hardware implementation of motion estimators in video compression algorithms. In this paper we are proposing a new architecture, which consists of a 2D parallel processing unit and a 1D unit both working in parallel. The proposed architecture reduces both data access power and computational power which are the main causes of power consumption in integer motion estimation. It also completes the operations with nearly the same number of clock cycles as compared to a 2D systolic array architecture. In this work sum of absolute difference (SAD)-the most repeated operation in block matching, is calculated in two steps. The first step is to calculate the SAD for alternate rows by a 2D parallel unit. If the SAD calculated by the parallel unit is less than the stored minimum SAD, the SAD of the remaining rows is calculated by the 1D unit. Early termination, which stops avoidable computations has been achieved with the help of alternate rows method proposed in this paper and by finding a low initial SAD value based on motion vector prediction. Data reuse has been applied to the reference blocks in the same search area which significantly reduced the memory access.

Fingerprint Compression Using Contourlet Transform and Multistage Vector Quantization

This paper presents a new fingerprint coding technique based on contourlet transform and multistage vector quantization. Wavelets have shown their ability in representing natural images that contain smooth areas separated with edges. However, wavelets cannot efficiently take advantage of the fact that the edges usually found in fingerprints are smooth curves. This issue is addressed by directional transforms, known as contourlets, which have the property of preserving edges. The contourlet transform is a new extension to the wavelet transform in two dimensions using nonseparable and directional filter banks. The computation and storage requirements are the major difficulty in implementing a vector quantizer. In the full-search algorithm, the computation and storage complexity is an exponential function of the number of bits used in quantizing each frame of spectral information. The storage requirement in multistage vector quantization is less when compared to full search vector quantization. The coefficients of contourlet transform are quantized by multistage vector quantization. The quantized coefficients are encoded by Huffman coding. The results obtained are tabulated and compared with the existing wavelet based ones.

A Pipelined FSBM Hardware Architecture for HTDV-H.26x

In MPEG and H.26x standards, to eliminate the temporal redundancy we use motion estimation. Given that the motion estimation stage is very complex in terms of computational effort, a hardware implementation on a re-configurable circuit is crucial for the requirements of different real time multimedia applications. In this paper, we present hardware architecture for motion estimation based on "Full Search Block Matching" (FSBM) algorithm. This architecture presents minimum latency, maximum throughput, full utilization of hardware resources such as embedded memory blocks, and combining both pipelining and parallel processing techniques. Our design is described in VHDL language, verified by simulation and implemented in a Stratix II EP2S130F1020C4 FPGA circuit. The experiment result show that the optimum operating clock frequency of the proposed design is 89MHz which achieves 160M pixels/sec.

Efficient Block Matching Algorithm for Motion Estimation

Motion estimation is a key problem in video processing and computer vision. Optical flow motion estimation can achieve high estimation accuracy when motion vector is small. Three-step search algorithm can handle large motion vector but not very accurate. A joint algorithm was proposed in this paper to achieve high estimation accuracy disregarding whether the motion vector is small or large, and keep the computation cost much lower than full search.

Performance Enhancement of Motion Estimation Using SSE2 Technology

Motion estimation is the most computationally intensive part in video processing. Many fast motion estimation algorithms have been proposed to decrease the computational complexity by reducing the number of candidate motion vectors. However, these studies are for fast search algorithms themselves while almost image and video compressions are operated with software based. Therefore, the timing constraints for running these motion estimation algorithms not only challenge for the video codec but also overwhelm for some of processors. In this paper, the performance of motion estimation is enhanced by using Intel's Streaming SIMD Extension 2 (SSE2) technology with Intel Pentium 4 processor.