Recognizing an Individual, Their Topic of Conversation, and Cultural Background from 3D Body Movement

The 3D body movement signals captured during
human-human conversation carry cues not only to the content of
people's communication but also to their culture and personality.
This paper is concerned with the automatic extraction of this
information from body movement signals. For this research, we
collected a novel corpus from 27 subjects, whom we arranged into
groups according to their culture. Within each group, subjects were
arranged into pairs, and each pair conversed about different topics.
A state-of-the-art recognition system is applied to the problems of
person, culture, and topic recognition, borrowing modeling,
classification, and normalization techniques from speech recognition.
We used Gaussian Mixture Modeling (GMM) as the main technique
for building our three systems, obtaining 77.78%, 55.47%, and
39.06% accuracy from the person, culture, and topic recognition
systems, respectively. In addition, we combined these GMM systems
with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and
40.63% accuracy for person, culture, and topic recognition,
respectively.
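
In speech recognition, such a GMM system is typically built by
training one model per target class and assigning a test recording to
the class whose model yields the highest average log-likelihood, while
a GMM-SVM system represents each recording as a supervector of
MAP-adapted component means classified by a linear SVM. The following
Python sketch illustrates this scheme; the frame-level feature
representation, the 32-component configuration, the relevance factor
of 16, and the use of scikit-learn are illustrative assumptions rather
than the exact setup used here.

    # Minimal sketch of per-class GMM scoring and GMM-SVM supervector
    # extraction. Input "frames" are assumed to be (n_frames, n_dims)
    # feature vectors computed from the motion-capture signals; the
    # component count and relevance factor are illustrative.
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.svm import LinearSVC

    def train_class_gmms(frames_by_class, n_components=32):
        # One diagonal-covariance GMM per target class
        # (person, culture, or topic).
        models = {}
        for label, frames in frames_by_class.items():
            gmm = GaussianMixture(n_components=n_components,
                                  covariance_type='diag',
                                  random_state=0)
            models[label] = gmm.fit(frames)
        return models

    def classify(models, test_frames):
        # Pick the class whose GMM gives the highest mean
        # log-likelihood over the test recording.
        return max(models,
                   key=lambda lab: models[lab].score(test_frames))

    def supervector(ubm, frames, relevance=16.0):
        # Relevance-MAP adaptation of the UBM means to one recording;
        # the stacked adapted means form the supervector for the SVM.
        post = ubm.predict_proba(frames)        # responsibilities
        n_k = post.sum(axis=0)                  # soft counts
        f_k = post.T @ frames                   # first-order stats
        alpha = (n_k / (n_k + relevance))[:, None]
        means = (alpha * (f_k / np.maximum(n_k, 1e-8)[:, None])
                 + (1.0 - alpha) * ubm.means_)
        return means.ravel()

    # SVM stage: train a linear SVM on one supervector per recording.
    #   svm = LinearSVC().fit(np.stack(train_supervectors), labels)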
Although direct comparison among these three recognition
systems is difficult, the person recognition system performs best
under both GMM and GMM-SVM, suggesting that inter-subject
differences (i.e., subjects' personality traits) are a major
source of variation. After removing these traits from the culture and
topic recognition systems using the Nuisance Attribute Projection
(NAP) and Intersession Variability Compensation (ISVC)
techniques, we obtained 73.44% and 46.09% accuracy for culture
and topic recognition, respectively.
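
NAP operates on the supervector representation: the leading directions
of within-class variation, here dominated by differences between
subjects who share the same culture or topic label, are estimated and
projected out before SVM training and scoring. A minimal sketch,
assuming NumPy arrays of supervectors; the corank (a single removed
direction) is an illustrative choice.

    # Minimal sketch of Nuisance Attribute Projection over GMM
    # supervectors. Within-class residuals define the nuisance
    # subspace, whose leading directions are projected away.
    import numpy as np

    def nap_matrix(supervectors, class_labels, corank=1):
        X = np.asarray(supervectors, dtype=float)  # (n_recs, n_dims)
        centered = np.empty_like(X)
        for lab in set(class_labels):
            idx = [i for i, l in enumerate(class_labels) if l == lab]
            # Subtracting the class mean leaves within-class
            # (nuisance) variation only.
            centered[idx] = X[idx] - X[idx].mean(axis=0)
        # Leading right singular vectors of the residuals span the
        # nuisance subspace.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        V = vt[:corank].T                          # (n_dims, corank)
        return np.eye(X.shape[1]) - V @ V.T        # P = I - V V^T

    # Apply the same projection to training and test supervectors
    # before SVM training and scoring:
    #   P = nap_matrix(train_sv, culture_labels)
    #   train_sv, test_sv = train_sv @ P, test_sv @ P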




