An Efficient Motion Recognition System Based on LMA Technique and a Discrete Hidden Markov Model

Human motion recognition has attracted growing interest in recent years because of its importance in a wide range of applications, such as human-computer interaction, intelligent surveillance, augmented reality, and content-based video compression and retrieval. However, it remains a challenging task, especially in realistic scenarios. It can be viewed as a general machine learning problem that requires an effective representation of human motion and an efficient learning method. In this work, we introduce a descriptor based on Laban Movement Analysis (LMA), a formal and universal language for human movement, to capture both quantitative and qualitative aspects of movement. We use a Discrete Hidden Markov Model (DHMM) to train on and classify motions. We improve the classification algorithm by using two DHMMs for each motion class, which process the motion sequence in two directions, forward and backward. This modification helps avoid the misclassification that can occur when recognizing similar motions. Two experiments are conducted. In the first, we evaluate our method on a public dataset, the Microsoft Research Cambridge-12 Kinect gesture dataset (MSRC-12), which is widely used for evaluating action and gesture recognition methods. In the second, we build a dataset composed of 10 gestures (introduce yourself, waving, dance, move, turn left, turn right, stop, sit down, increase velocity, decrease velocity) performed by 20 people. The evaluation tests the efficiency of our LMA-based descriptor vector with the basic DHMM method and compares the recognition results of the modified DHMM with those of the original. Experimental results demonstrate that our method outperforms most existing methods evaluated on the MSRC-12 dataset and achieves a near-perfect classification rate on our dataset.
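
The two-DHMM idea can be illustrated with a minimal sketch. It assumes the LMA descriptor of each frame has already been quantized into an integer symbol, and that the forward and backward scores of a class are combined by summing their log-likelihoods; the combination rule, the function names `log_likelihood` and `classify`, and the toy model parameters are all assumptions made for illustration, not details taken from this work.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n_states = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob

def classify(obs, models):
    """Pick the class whose forward and backward DHMMs jointly score highest.

    models maps label -> (forward_hmm, backward_hmm); each HMM is a
    (start, trans, emit) tuple. Summing the two scores is an assumption.
    """
    best_label, best_score = None, float("-inf")
    for label, (fwd, bwd) in models.items():
        score = log_likelihood(obs, *fwd) + log_likelihood(obs[::-1], *bwd)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy models: two left-to-right states, two symbols.
up = ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]])
down = ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]])
# The backward model of each class is the one fitted to reversed sequences.
models = {"raise": (up, down), "lower": (down, up)}

print(classify([0, 0, 1, 1], models))
```

In this toy setup the two classes are time-reversed versions of each other, so the backward model of one class coincides with the forward model of the other. In practice each DHMM would be trained separately (for example with Baum-Welch) on the original and the reversed training sequences of its class.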



