Faster Pedestrian Recognition Using Deformable Part Models

Deformable part models (DPMs) achieve high precision in
pedestrian recognition, but all publicly available implementations are
too slow for real-time applications. We implemented a deformable
part model algorithm fast enough for real-time use by exploiting
information about the camera position and orientation. This
implementation is both faster and more precise than alternative
DPM implementations. These results are obtained by computing
convolutions in the frequency domain and using lookup tables to
speed up feature computation. This approach is almost an order of
magnitude faster than the reference DPM implementation, with no
loss in precision. Because the position of the camera with respect to
the horizon is known, many hypotheses can also be pruned based on their
size and location. The range of acceptable sizes and positions is
set from the statistical distribution of bounding boxes in
labelled images. With this approach the entire feature pyramid need
not be computed: for example, higher-resolution features are
only needed near the horizon. This results in an increase in mean
average precision of 5% and an increase in speed by a factor of
two. Furthermore, to reduce misdetections involving small pedestrians
near the horizon, input images are supersampled in that region.
Supersampling the image at 1.5 times the original scale results in
an increase in precision of about 4%. The implementation was tested
against the public KITTI dataset, obtaining an 8% improvement in
mean average precision over the best performing DPM-based method.
By allowing for a small loss in precision, computation time can be
brought down to our target of 100 ms per image, yielding a
solution that is faster and still more precise than all publicly available
DPM implementations.
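
To make the frequency-domain step concrete, the following is a minimal sketch of how a sliding-window filter response can be obtained with FFTs instead of an explicit loop. It is not taken from the implementation described above: the use of NumPy, the single-channel feature map, and the function names correlate_direct and correlate_fft are assumptions made for illustration. A real DPM score would sum such responses over all feature channels, and the transform of the feature map could plausibly be reused across filters.

```python
import numpy as np


def correlate_direct(feature_map, template):
    """Reference sliding-window score: sum of feature * template per window."""
    H, W = feature_map.shape
    h, w = template.shape
    scores = np.empty((H - h + 1, W - w + 1))
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            scores[y, x] = np.sum(feature_map[y:y + h, x:x + w] * template)
    return scores


def correlate_fft(feature_map, template):
    """Same scores, computed as a product in the frequency domain."""
    H, W = feature_map.shape
    h, w = template.shape
    # Pad to the full linear-convolution size so that the circular
    # convolution implied by the DFT does not wrap around.
    shape = (H + h - 1, W + w - 1)
    # Cross-correlation equals convolution with the template flipped
    # along both axes.
    F = np.fft.rfft2(feature_map, s=shape)
    T = np.fft.rfft2(template[::-1, ::-1], s=shape)
    full = np.fft.irfft2(F * T, s=shape)
    # Keep only positions where the template lies fully inside the map.
    return full[h - 1:H, w - 1:W]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fm = rng.standard_normal((40, 30))   # hypothetical one-channel feature map
    tp = rng.standard_normal((6, 4))     # hypothetical filter
    print(np.allclose(correlate_direct(fm, tp), correlate_fft(fm, tp)))
```

Both functions return one score per valid window position, so the FFT version can be checked directly against the loop on small inputs before it is used on full feature pyramids.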

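The pruning of hypotheses by size and location can be illustrated in a similar spirit. The sketch below assumes a flat ground plane, so that the pixel height of a pedestrian grows roughly linearly with the distance of the feet below the horizon, and it accepts a hypothesis when the ratio of these two quantities falls within a range estimated from labelled boxes. The ratio criterion, the two-sigma range, and the names fit_ratio_range and is_plausible are illustrative assumptions, not the statistics actually used in the work summarized above.

```python
from statistics import mean, stdev


def fit_ratio_range(labelled_boxes, horizon_y, n_sigma=2.0):
    """Estimate the acceptable height / (foot row - horizon row) range.

    labelled_boxes: iterable of (top, bottom) pixel rows of annotated
    pedestrians. Assumption: flat ground and a known horizon row.
    """
    ratios = [(bottom - top) / (bottom - horizon_y)
              for top, bottom in labelled_boxes
              if bottom > horizon_y]
    mu, sigma = mean(ratios), stdev(ratios)
    return mu - n_sigma * sigma, mu + n_sigma * sigma


def is_plausible(top, bottom, horizon_y, ratio_range):
    """Return True if a hypothesis has a size consistent with its position."""
    if bottom <= horizon_y:
        # Feet at or above the horizon: not standing on the ground plane.
        return False
    lo, hi = ratio_range
    return lo <= (bottom - top) / (bottom - horizon_y) <= hi


if __name__ == "__main__":
    # Hypothetical annotations (top row, bottom row); horizon at row 170.
    boxes = [(148, 230), (161, 200), (138, 260), (156, 215), (146, 245)]
    ratio_range = fit_ratio_range(boxes, horizon_y=170)
    print(is_plausible(150, 230, 170, ratio_range))  # size matches position
    print(is_plausible(100, 230, 170, ratio_range))  # too tall for its position
```

Hypotheses rejected by such a test never need to be scored, which is also why pyramid levels that can only produce implausible sizes at a given image row can be skipped.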


