A Pipelined FSBM Hardware Architecture for HDTV-H.26x

In the MPEG and H.26x standards, motion estimation is used to eliminate temporal redundancy. Because the motion estimation stage is computationally very demanding, a hardware implementation on a reconfigurable circuit is crucial for meeting the requirements of real-time multimedia applications. In this paper, we present a hardware architecture for motion estimation based on the Full Search Block Matching (FSBM) algorithm. The architecture combines pipelining and parallel processing techniques to achieve minimum latency, maximum throughput, and full utilization of hardware resources such as embedded memory blocks. Our design is described in VHDL, verified by simulation, and implemented on a Stratix II EP2S130F1020C4 FPGA. The experimental results show that the optimum operating clock frequency of the proposed design is 89 MHz, which achieves a throughput of 160 Mpixels/s.
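For readers unfamiliar with FSBM, the following is a minimal software sketch of the exhaustive search that such an architecture accelerates. It is not the proposed hardware design: the matching criterion (sum of absolute differences, SAD), the block size `n`, the search range `p`, and the function names `sad` and `full_search` are all assumptions chosen for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy).
    SAD is an assumed matching criterion, not one stated in the abstract."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=4, p=2):
    """Test every displacement in [-p, p] x [-p, p] and return the best
    match as (cost, dx, dy). Frames are lists of rows of integer pixels.
    The defaults for n and p are illustrative assumptions."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            # Strict comparison keeps the first candidate found on ties.
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Each block requires up to (2p + 1)^2 candidate evaluations of n^2 pixel differences, and every candidate is independent of the others. That independence is what makes the algorithm a natural fit for pipelined and parallel hardware.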



