Abstract: Motion capture devices have been used to produce
various kinds of content, such as movies and video games. However,
since motion capture devices are expensive and inconvenient to use,
motions segmented from captured data are often recycled and synthesized
for use in other content, and these motions have generally been
segmented manually by content producers. Therefore, automatic
motion segmentation has recently attracted considerable attention. Previous
approaches are divided into on-line and off-line methods: on-line
approaches segment motions based on similarities between
neighboring frames, whereas off-line approaches segment motions by
capturing global characteristics in feature space. In this paper, we
propose a graph-based high-level motion segmentation method. Since
high-level motions consist of frames that repeat within a certain
temporal distance, we consider the similarities among all frames within
that temporal distance. This is achieved by constructing a graph, where
each vertex represents a frame and the edges between the frames are
weighted by their similarity. Then, the normalized cuts algorithm is used
to partition the constructed graph into several sub-graphs by globally
finding minimum cuts. In the experiments, the proposed method
outperformed an on-line PCA-based method and an off-line GMM-based
method, because it segments motions globally from a graph built on
similarities between neighboring frames as well as similarities among
all frames within the temporal distance.
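
The graph construction and partitioning described above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian similarity, the parameter names `window` and `sigma`, and the zero threshold on the eigenvector are all choices made for this sketch, and the cut shown is a single two-way split using the standard spectral relaxation of the normalized cut.

```python
import numpy as np


def build_temporal_graph(frames, window, sigma=1.0):
    """Affinity matrix for a frame graph limited to a temporal window.

    frames: (n_frames, n_features) array, one feature vector per frame.
    window: assumed maximum temporal distance |i - j| that gets an edge.
    sigma:  assumed width of the Gaussian similarity kernel.
    """
    frames = np.asarray(frames, dtype=float)
    n = frames.shape[0]
    # Pairwise squared Euclidean distances between frame features.
    diff = frames[:, None, :] - frames[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # Keep only edges between frames that are close in time.
    idx = np.arange(n)
    within_window = np.abs(idx[:, None] - idx[None, :]) <= window
    return weights * within_window


def normalized_cut_bipartition(weights):
    """Two-way normalized cut via the spectral relaxation.

    Uses the eigenvector of the second-smallest eigenvalue of the
    symmetrically normalized Laplacian and thresholds it at zero.
    """
    degree = weights.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.diag(degree) - weights
    normalized = inv_sqrt[:, None] * laplacian * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(normalized)  # eigenvalues in ascending order
    indicator = inv_sqrt * vecs[:, 1]
    return indicator > 0


def segment_boundaries(labels):
    """Frame indices at which the partition label changes."""
    labels = np.asarray(labels)
    return (np.flatnonzero(labels[1:] != labels[:-1]) + 1).tolist()
```

To obtain several sub-graphs, one common option is to apply the two-way cut recursively to each part until a stopping criterion is met; the abstract does not say which strategy the authors use.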
Abstract: Finger spelling is the art of communicating by signs
made with the fingers, and has been introduced into sign language to serve
as a bridge between sign language and verbal language.
Previous approaches to finger spelling recognition are classified into
two categories: glove-based and vision-based approaches. The
glove-based approach recognizes hand postures more simply and
accurately than the vision-based approach, yet its interface requires the
user to wear a cumbersome device and carry a load of cables that connect
the device to a computer. In contrast, vision-based approaches provide
an attractive alternative to this cumbersome interface, and promise
more natural and unobtrusive human-computer interaction. The
vision-based approaches generally consist of two steps, hand
extraction and recognition, which are processed independently.
This paper proposes a real-time vision-based Korean finger spelling
recognition system that integrates hand extraction into recognition.
First, we tentatively detect a hand region using the CAMShift algorithm.
Then the fill factor and aspect ratio, computed from the width and height
estimated by CAMShift, are used to choose candidates from the database,
which reduces the number of matching operations in the recognition step. To
recognize the finger spelling, we use dynamic time warping (DTW)
based on modified chain codes, which is robust to scale and orientation
variations. In this procedure, since accurate hand regions, without
holes and noise, must be extracted to improve precision, we use the
graph cuts algorithm, which globally minimizes an energy function
expressed by Markov random fields (MRFs). In the
experiments, the computational time was less than 130 ms and was
independent of the number of finger spelling templates in the
database, because candidate templates are selected in the extraction step.
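
The candidate selection and DTW matching steps described above might look roughly like the sketch below. It rests on several assumptions that the abstract does not confirm: plain 8-direction chain codes with a cyclic distance stand in for the paper's "modified chain codes", the tolerance `tol` and the dictionary keys (`fill`, `aspect`, `code`, `label`) are hypothetical, and the fill factor and aspect ratio are taken as precomputed values.

```python
def chain_code_distance(a, b, directions=8):
    """Cyclic distance between two direction codes (7 and 0 differ by 1)."""
    d = abs(a - b) % directions
    return min(d, directions - d)


def dtw_distance(seq_a, seq_b, directions=8):
    """Classic dynamic time warping cost between two chain-code sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = chain_code_distance(seq_a[i - 1], seq_b[j - 1], directions)
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


def select_candidates(query, templates, tol=0.15):
    """Keep templates whose fill factor and aspect ratio are near the query's.

    tol is a hypothetical absolute tolerance applied to both features.
    """
    return [t for t in templates
            if abs(t["fill"] - query["fill"]) <= tol
            and abs(t["aspect"] - query["aspect"]) <= tol]


def recognize(query, templates, tol=0.15):
    """Return the label of the closest candidate by DTW, or None if none pass."""
    candidates = select_candidates(query, templates, tol)
    if not candidates:
        return None
    best = min(candidates,
               key=lambda t: dtw_distance(query["code"], t["code"]))
    return best["label"]
```

Because DTW may align one element with several consecutive elements of the other sequence, a contour whose chain code is the same shape sampled twice as densely (each code repeated) still matches at zero cost, which illustrates the kind of scale tolerance the abstract refers to. Running DTW only on the pre-filtered candidates is what keeps the matching cost from growing with the full template count.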