Face Reconstruction and Camera Pose Using Multi-dimensional Descent

This paper proposes a novel, robust, and simple method for obtaining a 3D model of a human face and the camera pose (position and orientation) from a video sequence. Given a video sequence of a face recorded with an off-the-shelf digital camera, the feature points that define the facial parts are tracked using the Active Appearance Model (AAM). The face's 3D structure and the camera pose for each video frame are then calculated simultaneously from the resulting point correspondences. The proposed method is based primarily on a combination of Gradient Descent and Powell's Multidimensional Minimization. With this method, temporarily occluded points, including self-occluded ones, do not pose a problem: as long as the point correspondences in the video sequence have enough parallax, these missing points can still be reconstructed.
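The following is a minimal illustrative sketch of the idea summarized above, not the implementation used in this paper. It minimizes the reprojection error over visible observations only, so occluded points simply contribute no residual in the frames where they are hidden. Several simplifications are assumptions made for brevity: the camera only rotates about the vertical axis at a fixed distance (`DEPTH`), the focal length (`FOCAL`) is known, all function names and the synthetic landmark data are hypothetical, and a plain coordinate-wise search stands in for Powell's direction-set method.

```python
import math

FOCAL = 1.0   # assumed focal length (hypothetical value)
DEPTH = 5.0   # assumed fixed camera-to-face distance (hypothetical value)

def project(point, yaw):
    """Perspective projection of a 3D point for a camera rotated by yaw about the y axis."""
    x, y, z = point
    xc = math.cos(yaw) * x + math.sin(yaw) * z
    zc = max(-math.sin(yaw) * x + math.cos(yaw) * z + DEPTH, 1e-6)  # guard against zc <= 0
    return FOCAL * xc / zc, FOCAL * y / zc

def unpack(params, n_frames):
    """Split the flat parameter vector into per-frame yaws and 3D points."""
    yaws = [0.0] + list(params[:n_frames - 1])  # frame 0 is the reference pose
    flat = params[n_frames - 1:]
    return yaws, [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

def reprojection_error(params, observations, visible):
    """Sum of squared 2D residuals, skipping occluded observations."""
    yaws, points = unpack(params, len(observations))
    total = 0.0
    for f, yaw in enumerate(yaws):
        for p, point in enumerate(points):
            if visible[f][p]:
                u, v = project(point, yaw)
                total += (u - observations[f][p][0]) ** 2 + (v - observations[f][p][1]) ** 2
    return total

def numerical_gradient(cost, params, eps=1e-6):
    """Central-difference gradient of cost at params."""
    grad = []
    for i in range(len(params)):
        hi, lo = list(params), list(params)
        hi[i] += eps
        lo[i] -= eps
        grad.append((cost(hi) - cost(lo)) / (2 * eps))
    return grad

def gradient_descent(cost, params, iterations=200):
    """Gradient descent with backtracking; only cost-decreasing steps are accepted."""
    params, current = list(params), cost(params)
    for _ in range(iterations):
        grad, step, moved = numerical_gradient(cost, params), 1.0, False
        while step > 1e-10 and not moved:
            trial = [p - step * g for p, g in zip(params, grad)]
            value = cost(trial)
            if value < current:
                params, current, moved = trial, value, True
            step *= 0.5
        if not moved:
            break
    return params

def coordinate_refine(cost, params, sweeps=20, delta=0.05):
    """Derivative-free coordinate search (simplified stand-in for Powell's method)."""
    params, current = list(params), cost(params)
    for _ in range(sweeps):
        improved = False
        for i in range(len(params)):
            for sign in (1.0, -1.0):
                trial = list(params)
                trial[i] += sign * delta
                value = cost(trial)
                if value < current:
                    params, current, improved = trial, value, True
                    break
        if not improved:
            delta *= 0.5  # shrink the search step when no coordinate helps
    return params

# Synthetic data (hypothetical): five landmarks seen in four frames.
true_points = [(-0.5, 0.3, 0.0), (0.5, 0.3, 0.0), (0.0, 0.0, -0.4),
               (-0.3, -0.4, 0.0), (0.3, -0.4, 0.0)]
true_yaws = [0.0, 0.3, -0.3, 0.5]
visible = [[True] * 5 for _ in true_yaws]
visible[1][0] = False  # landmark 0 is occluded in frame 1
visible[3][4] = False  # landmark 4 is occluded in frame 3
observations = [[project(pt, yaw) if visible[f][p] else (0.0, 0.0)
                 for p, pt in enumerate(true_points)]
                for f, yaw in enumerate(true_yaws)]

# Perturbed initial guess: yaws of frames 1..3, then the 3D points.
initial = [0.2, -0.2, 0.4]
for x, y, z in true_points:
    initial += [x + 0.1, y - 0.05, z + 0.15]

def cost(params):
    return reprojection_error(params, observations, visible)

start_error = cost(initial)
after_gd = gradient_descent(cost, initial)
gd_error = cost(after_gd)
final_error = cost(coordinate_refine(cost, after_gd))
print(start_error, gd_error, final_error)
```

Because both stages accept only cost-decreasing steps, the printed errors are non-increasing from left to right. The occluded landmarks are still estimated, since each remains visible in at least two other frames with different yaw angles.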




