Low Power and Less Area Architecture for Integer Motion Estimation

The full-search block-matching algorithm is widely used in hardware implementations of motion estimators for video compression. In this paper we propose a new architecture consisting of a 2D parallel processing unit and a 1D unit that operate in parallel. The proposed architecture reduces both data-access power and computational power, which are the main contributors to power consumption in integer motion estimation, while completing its operations in nearly the same number of clock cycles as a 2D systolic array architecture. The sum of absolute differences (SAD), the most frequently repeated operation in block matching, is calculated in two steps. First, the 2D parallel unit calculates the SAD over alternate rows. Then, only if this partial SAD is less than the stored minimum SAD, the 1D unit calculates the SAD of the remaining rows. Early termination, which skips avoidable computations, is achieved through the alternate-rows method proposed in this paper and by finding a low initial SAD value based on motion vector prediction. Data reuse is applied to the reference blocks within the same search area, which significantly reduces memory access.
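The two-step SAD calculation with early termination can be illustrated in software. The following is a minimal sketch under stated assumptions, not the proposed hardware: the function names (`partial_sad`, `two_step_search`), the `pred` argument standing in for the predicted motion vector, and the use of window offsets as motion vectors are all hypothetical, and the parallelism of the 2D and 1D units and the data reuse scheme are not modelled. Because every term of the SAD is non-negative, the alternate-row sum is a lower bound on the full SAD, so a candidate whose partial SAD is not below the stored minimum can be skipped without changing the result of the full search.

```python
import random


def partial_sad(cur, area, dy, dx, rows):
    """SAD between the current block and the candidate at (dy, dx), over the given rows only."""
    n = len(cur)
    total = 0
    for r in rows:
        ref_row = area[dy + r][dx:dx + n]
        total += sum(abs(a - b) for a, b in zip(cur[r], ref_row))
    return total


def two_step_search(cur, area, pred=(0, 0)):
    """Full search with alternate-row early termination (illustrative sketch).

    cur  : n x n current block (list of lists of ints)
    area : search area containing all candidate blocks
    pred : hypothetical predicted offset used to seed the minimum SAD
    Returns (best_offset, best_sad, skipped_candidates).
    """
    n = len(cur)
    even_rows = range(0, n, 2)   # step 1: rows handled by the "2D unit"
    odd_rows = range(1, n, 2)    # step 2: rows handled by the "1D unit"

    # Seed the stored minimum with the full SAD at the predicted position.
    best_mv = pred
    best_sad = partial_sad(cur, area, pred[0], pred[1], range(n))
    skipped = 0

    for dy in range(len(area) - n + 1):
        for dx in range(len(area[0]) - n + 1):
            first = partial_sad(cur, area, dy, dx, even_rows)
            if first >= best_sad:
                skipped += 1     # early termination: remaining rows not computed
                continue
            total = first + partial_sad(cur, area, dy, dx, odd_rows)
            if total < best_sad:
                best_sad, best_mv = total, (dy, dx)
    return best_mv, best_sad, skipped


if __name__ == "__main__":
    rng = random.Random(0)
    area = [[rng.randrange(256) for _ in range(24)] for _ in range(24)]
    cur = [row[5:13] for row in area[3:11]]   # 8x8 block copied from offset (3, 5)
    print(two_step_search(cur, area, pred=(2, 4)))
```

The `skipped` count gives a rough indication of how many candidates never reach the second step. In this sketch a better initial SAD from the predicted position can only increase that count, which is the role the abstract assigns to motion vector prediction.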




