Graph-Based High-Level Motion Segmentation Using Normalized Cuts

Motion capture devices are widely used to produce content such as films and video games. However, because these devices are expensive and inconvenient to operate, motions segmented from captured data are commonly recycled and synthesized for other content, and this segmentation has generally been performed manually by content producers. Automatic motion segmentation has therefore recently attracted considerable attention. Previous approaches fall into two categories: on-line approaches, which segment motion based on similarities between neighboring frames, and off-line approaches, which segment motion by capturing its global characteristics in a feature space. In this paper, we propose a graph-based high-level motion segmentation method. Since a high-level motion consists of frame sequences that repeat within a temporal distance, we consider all similarities among frames within that distance. This is achieved by constructing a graph in which each vertex represents a frame and each edge is weighted by the similarity between the frames it connects. The normalized cuts algorithm is then used to partition the constructed graph into several subgraphs by globally finding minimum cuts. In our experiments, the proposed method outperformed a PCA-based on-line method and a GMM-based off-line method, because it segments motion globally from a graph that encodes similarities between neighboring frames as well as among all frames within the temporal distance.
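The pipeline described above, a frame-similarity graph restricted to a temporal window followed by a normalized-cuts partition, can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's implementation: the per-frame pose feature vectors, the Gaussian similarity kernel with bandwidth sigma, the window size, and the median-threshold bipartition are all assumptions made here for the example. The generalized eigenproblem (D - W)y = lambda * D * y is the standard two-way normalized-cut formulation of Shi and Malik.

```python
import numpy as np
from scipy.linalg import eigh

def frame_affinity(frames, window, sigma):
    """Windowed frame-similarity graph.

    frames : (T, D) array of per-frame pose features (assumed input;
             e.g., stacked joint angles for each captured frame).
    window : frame pairs farther apart than this get no edge.
    sigma  : bandwidth of the Gaussian similarity kernel (assumed).
    """
    T = len(frames)
    W = np.zeros((T, T))
    for i in range(T):
        j_max = min(T, i + window + 1)
        d = np.linalg.norm(frames[i] - frames[i:j_max], axis=1)
        W[i, i:j_max] = np.exp(-d**2 / (2.0 * sigma**2))
        W[i:j_max, i] = W[i, i:j_max]  # keep the graph symmetric
    return W

def normalized_cut(W):
    """Two-way normalized cut: split by the sign of the eigenvector for the
    second-smallest generalized eigenvalue of (D - W) y = lam * D y."""
    D = np.diag(W.sum(axis=1))       # degree matrix; positive definite here
    L = D - W                        # unnormalized graph Laplacian
    vals, vecs = eigh(L, D)          # generalized symmetric eigenproblem
    fiedler = vecs[:, 1]             # second-smallest eigenvalue's vector
    # Median split is a simple illustrative threshold; Shi and Malik search
    # for the threshold that minimizes the Ncut value instead.
    return fiedler > np.median(fiedler)
```

As a toy usage, two synthetic motion regimes with distinct feature statistics are recovered as two segments; k-way segmentation would recurse on each part or cluster the k smallest eigenvectors:

```python
rng = np.random.default_rng(0)
frames = np.vstack([rng.normal(0.0, 0.1, (100, 8)),   # regime A
                    rng.normal(3.0, 0.1, (100, 8))])  # regime B
labels = normalized_cut(frame_affinity(frames, window=30, sigma=1.0))
```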