Abstract: In this paper, we present a human action recognition method based on a variational Bayesian HMM with a Dirichlet process mixture (DPM) of Gaussian-Wishart emission models (GWEM). First, we define a Bayesian HMM based on the Dirichlet process, which allows an infinite number of Gaussian-Wishart components to support continuous emission observations. Second, we develop an efficient variational Bayesian inference method that derives the posterior distribution of the hidden variables and model parameters for the proposed model from training data, and we then derive the predictive distribution used to classify new actions. Third, we propose a process for extracting appropriate spatio-temporal feature vectors that can be used to recognize a wide range of human behaviors from input video. Finally, we conduct experiments to evaluate the performance of the proposed method. The experimental results show that the presented method is more effective for human action recognition than existing methods.
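As an illustrative aside: the final classification step this abstract describes, scoring a new sequence under each class's model, can be approximated with plain HMM likelihoods. The sketch below is a minimal, hedged version that assumes per-class HMM parameters (initial probabilities, transition matrix, per-state Gaussian means and covariances) have already been fitted; it uses the standard log-space forward algorithm and omits the variational DPM machinery entirely. All names are illustrative.

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_likelihood_hmm(X, pi, A, means, covs):
    """Forward algorithm in log space for an HMM with Gaussian emissions.

    X: (T, d) observation sequence; pi: (K,) initial state probabilities;
    A: (K, K) transition matrix; means/covs: per-state Gaussian parameters.
    """
    T, K = len(X), len(pi)
    # Per-state emission log densities for every frame.
    log_b = np.column_stack(
        [multivariate_normal.logpdf(X, means[k], covs[k]) for k in range(K)]
    )
    log_alpha = np.log(pi) + log_b[0]
    for t in range(1, T):
        # Log-sum-exp recursion over predecessor states.
        m = log_alpha.max()
        log_alpha = m + np.log(np.exp(log_alpha - m) @ A) + log_b[t]
    m = log_alpha.max()
    return m + np.log(np.exp(log_alpha - m).sum())

def classify(X, class_models):
    """Pick the action class whose HMM explains the sequence best.

    class_models: dict mapping class name -> (pi, A, means, covs).
    """
    return max(class_models,
               key=lambda c: log_likelihood_hmm(X, *class_models[c]))
```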
Abstract: Human action is recognized directly from video sequences. The objective of this work is to recognize various human actions such as running, jumping, and walking. Human action recognition requires some prior knowledge about actions, namely motion estimation and foreground/background estimation. A region of interest (ROI) is extracted to identify the human in each frame. Then, an optical flow technique is used to extract the motion vectors. Using the extracted features, similarity-measure-based classification is performed to recognize the action. Experiments on the Weizmann database show that the proposed method offers high accuracy.
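A minimal sketch of the pipeline described above, under loose assumptions: background subtraction stands in for the ROI step, Farneback optical flow supplies the motion vectors, and a nearest-neighbour rule serves as the similarity-measure classifier. The paper's exact descriptors and similarity measure are not specified here; everything below is illustrative.

```python
import cv2
import numpy as np

def motion_descriptor(video_path, bins=8):
    """Histogram of optical-flow orientations inside the foreground region."""
    cap = cv2.VideoCapture(video_path)
    bg = cv2.createBackgroundSubtractorMOG2()
    prev_gray, hist = None, np.zeros(bins)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        mask = bg.apply(frame) > 0            # rough foreground (the person)
        if prev_gray is not None and mask.any():
            flow = cv2.calcOpticalFlowFarneback(
                prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
            mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
            # Magnitude-weighted orientation histogram over foreground pixels.
            h, _ = np.histogram(ang[mask], bins=bins,
                                range=(0, 2 * np.pi), weights=mag[mask])
            hist += h
        prev_gray = gray
    cap.release()
    return hist / (np.linalg.norm(hist) + 1e-9)

def classify(desc, train_descs, train_labels):
    """Nearest-neighbour classification against labelled training videos."""
    d = [np.linalg.norm(desc - t) for t in train_descs]
    return train_labels[int(np.argmin(d))]
```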
Abstract: Recognizing human actions from videos is an active field of research in computer vision and pattern recognition. Human activity recognition has many potential applications, such as video surveillance, human-machine interaction, sports video retrieval, and robot navigation. Currently, local descriptors and bag-of-visual-words models achieve state-of-the-art performance for human action recognition. The main challenge in feature description is how to represent local motion information efficiently. Most previous works focus on extending 2D local descriptors to 3D ones that describe the local information around every interest point. In this paper, we propose a new spatio-temporal descriptor based on a space-time description of moving points. Our description is built on an Accordion representation of video, which is well suited to recognizing human actions from 2D local descriptors without the need for 3D extensions. We use the bag-of-words approach to represent videos. We quantize a 2D local descriptor that captures both temporal and spatial features, with a good compromise between computational complexity and action recognition rates. We achieve strong results on publicly available action datasets.
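The bag-of-words representation mentioned above can be sketched briefly. The code assumes 2D local descriptors (e.g., computed on the Accordion representation) are already extracted per video; the vocabulary size of 500 is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(all_descriptors, k=500):
    """Cluster the pooled 2D local descriptors into k visual words."""
    return KMeans(n_clusters=k, n_init=10).fit(np.vstack(all_descriptors))

def bow_histogram(descriptors, vocab):
    """Quantize one video's descriptors and return an L1-normalized histogram."""
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / hist.sum()
```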
Abstract: In this paper we present a system for classifying videos by their frequency spectra. Many videos contain activities with repeating movements; sports videos, home-improvement videos, and videos showing mechanical motion are some example areas. Motion in these areas usually repeats with a certain main frequency and several side frequencies. Transforming the repeating motion to the frequency domain via the FFT reveals these frequencies, and the average amplitudes of frequency intervals can serve as features of the cyclic motion. Determining these features can therefore help to classify videos with repeating movements. In this paper we explain how to compute frequency spectra for video clips and how to use them for classification. Our approach treats a series of image moments as a function, which is then transformed into the frequency domain.
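A hedged sketch of this approach: an image-moment series is computed per frame (here the centroid x-coordinate from binary moments, one plausible choice among many), and averaged FFT amplitudes over frequency bands serve as cyclic-motion features. The thresholding step and the band count are illustrative assumptions.

```python
import cv2
import numpy as np

def moment_series(video_path):
    """Centroid x-coordinate per frame, computed from binary image moments."""
    cap, series = cv2.VideoCapture(video_path), []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        m = cv2.moments(binary)
        series.append(m["m10"] / (m["m00"] + 1e-9))
    cap.release()
    return np.asarray(series)

def frequency_features(series, n_bands=8):
    """Average FFT amplitude per frequency band, as cyclic-motion features."""
    amp = np.abs(np.fft.rfft(series - series.mean()))
    bands = np.array_split(amp[1:], n_bands)   # drop the DC term
    return np.array([b.mean() for b in bands])
```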
Abstract: Imitation learning is considered an effective way of teaching humanoid robots, and action recognition is the key step in imitation learning. In this paper, an online algorithm to recognize parametric actions with object context is presented. Objects are key instruments in understanding an action when there is uncertainty, and ambiguities arising in similar actions can be resolved with object context. We classify actions according to the changes they make to the object space: actions that produce the same state change in the object movement space are classified as belonging to the same class. This allows us to define several classes of actions where the members of each class are connected by a semantic interpretation.
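As an illustrative sketch of the classification-by-state-change idea: if each action class is characterized by the change it produces in the object space, a new action can be assigned to the class whose prototype change best matches the observed one. The prototype vectors and action names below are entirely hypothetical.

```python
import numpy as np

# Hypothetical class prototypes: the state change each action class
# induces on the object (e.g., displacement of the manipulated object).
PROTOTYPES = {
    "pick_up":  np.array([0.0, 0.0, +0.3]),   # object moves up
    "put_down": np.array([0.0, 0.0, -0.3]),   # object moves down
    "push":     np.array([+0.4, 0.0, 0.0]),   # object moves forward
}

def classify_by_object_context(state_before, state_after):
    """Assign the action whose prototype best matches the observed change."""
    delta = np.asarray(state_after) - np.asarray(state_before)
    return min(PROTOTYPES, key=lambda a: np.linalg.norm(delta - PROTOTYPES[a]))
```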
Abstract: Naive Bayes Nearest Neighbor (NBNN) and its variants, i.e., local NBNN and the NBNN kernels, are local feature-based classifiers that have achieved impressive performance in image classification. By exploiting instance-to-class (I2C) distances (where an instance is an image or video in image or video classification), they avoid the quantization errors of local image descriptors in the bag-of-words (BoW) model. However, the performance of NBNN, local NBNN, and the NBNN kernels has not been validated on video analysis. In this paper, we introduce these three classifiers into human action recognition and conduct comprehensive experiments on the benchmark KTH and the realistic HMDB datasets. The results show that these I2C-based classifiers consistently outperform an SVM classifier with the BoW model.
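The I2C scoring that NBNN performs can be sketched compactly: for every local descriptor of a query video, find its nearest neighbour in each class's pooled descriptor set, sum the squared distances per class, and pick the class with the smallest total. This is a minimal version of standard NBNN only, not the local-NBNN or kernel variants.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def fit_nbnn(train_descs_by_class):
    """One nearest-neighbour index over the pooled descriptors of each class."""
    return {c: NearestNeighbors(n_neighbors=1).fit(np.vstack(d))
            for c, d in train_descs_by_class.items()}

def nbnn_classify(query_descs, class_indexes):
    """Sum of squared instance-to-class (I2C) distances; smallest total wins."""
    def i2c(index):
        dist, _ = index.kneighbors(query_descs)
        return float((dist ** 2).sum())
    return min(class_indexes, key=lambda c: i2c(class_indexes[c]))
```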
Abstract: In this paper, the use of sequential machines for recognizing the actions taken by objects detected by a general tracking algorithm is proposed. The system can deal with the uncertainty inherent in medium-level vision data; for this purpose, the input data are fuzzified. This transformation also allows the data to be managed independently of the selected tracking application and enables adding characteristics of the analyzed scenario. The representation of actions by means of an automaton and the generation of the input symbols for the finite automaton, depending on the object and action being compared, are described. The output of the comparison process between an object and an action is a numerical value that represents the membership of the object to the action, computed according to how similar the object and the action are. The work concludes with the application of the proposed technique to identifying the behavior of vehicles in road traffic scenes.
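A minimal sketch of the fuzzification idea, with entirely hypothetical fuzzy sets and action patterns: tracked measurements (here an object's speed) are mapped to membership degrees, and an action's membership score is taken as the minimum firing strength along the automaton's symbol path.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside [a, d], 1 on [b, c], linear ramps."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# Hypothetical fuzzy sets over a tracked vehicle's speed (units arbitrary).
SPEED_SETS = {
    "stopped": lambda v: trapezoid(v, -1.0, -0.5, 0.5, 2.0),
    "slow":    lambda v: trapezoid(v, 0.5, 2.0, 5.0, 8.0),
    "fast":    lambda v: trapezoid(v, 5.0, 8.0, 50.0, 60.0),
}

def action_membership(speeds, pattern):
    """Degree to which a speed sequence matches an action's symbol pattern."""
    score = 1.0
    for v, symbol in zip(speeds, pattern):
        score = min(score, SPEED_SETS[symbol](v))
    return score

# e.g. a hypothetical 'stop-and-go' action at a junction:
# action_membership([6.0, 0.2, 7.5], ["fast", "stopped", "fast"])
```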
Abstract: This paper proposes new hybrid approaches for face recognition. The Gabor wavelet representation of face images is an effective approach for both facial action recognition and face identification. Performing dimensionality reduction and linear discriminant analysis on the down-sampled Gabor wavelet faces can increase the discriminative ability. The nearest feature space is extended to various similarity measures. In our experiments, the proposed Gabor wavelet faces combined with the extended nearest feature space classifier show very good performance, achieving a maximum correct recognition rate of 93% on the ORL dataset without any preprocessing step.
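A hedged sketch of the Gabor-wavelet-faces pipeline: a bank of Gabor filters is applied to a face image, the magnitude responses are down-sampled and stacked, and PCA plus LDA provide the dimensionality reduction and discriminant analysis. For brevity, LDA's built-in classification stands in for the extended nearest feature space classifier; all filter parameters are illustrative.

```python
import cv2
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

def gabor_face(img, scales=(4, 8, 16), n_orient=4, down=4):
    """Stacked, down-sampled Gabor magnitude responses of one face image."""
    feats = []
    for lam in scales:
        for i in range(n_orient):
            k = cv2.getGaborKernel((21, 21), sigma=lam / 2,
                                   theta=i * np.pi / n_orient,
                                   lambd=lam, gamma=0.5)
            resp = np.abs(cv2.filter2D(img.astype(np.float32), cv2.CV_32F, k))
            feats.append(resp[::down, ::down].ravel())   # down-sample
    return np.concatenate(feats)

# PCA for dimensionality reduction, then LDA for class discrimination.
model = make_pipeline(PCA(n_components=100), LinearDiscriminantAnalysis())
# model.fit(np.array([gabor_face(f) for f in train_faces]), train_labels)
```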
Abstract: In this paper, a novel algorithm based on the Ridgelet transform and a support vector machine is proposed for human action recognition. The Ridgelet transform is a directional multi-resolution transform, and it is well suited to describing a human action because it captures directional information, which is used to form spatial feature vectors. The dynamic transition between the spatial features is modeled using both Principal Component Analysis and the k-means clustering algorithm. First, Principal Component Analysis is used to reduce the dimensionality of the obtained vectors. Then, the k-means algorithm is used to cluster the obtained vectors into a spatio-temporal pattern, called a set-of-labels, according to the given periodicity of the human action. Finally, a Support Vector Machine classifier is used to discriminate between the different human actions. Various tests are conducted on popular datasets such as Weizmann and KTH. The obtained results show that the proposed method provides a significant accuracy rate and is more robust in very challenging situations such as lighting changes, scaling, and dynamic environments.
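The PCA/k-means/SVM stages of this pipeline can be sketched as follows, assuming the per-frame Ridgelet feature vectors are precomputed (the Ridgelet transform itself is not reproduced here); the component and label counts are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def fit_label_pipeline(train_feats, n_components=20, n_labels=16):
    """PCA-reduce per-frame feature vectors, then k-means them into labels."""
    pca = PCA(n_components=n_components).fit(np.vstack(train_feats))
    km = KMeans(n_clusters=n_labels, n_init=10).fit(
        np.vstack([pca.transform(f) for f in train_feats]))
    return pca, km

def set_of_labels(feats, pca, km, n_labels=16):
    """Histogram of k-means labels over one action's frame features."""
    labels = km.predict(pca.transform(feats))
    h = np.bincount(labels, minlength=n_labels).astype(float)
    return h / h.sum()

# The label histograms then feed a standard SVM, e.g.:
# svm = SVC(kernel="rbf").fit(train_histograms, train_labels)
```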
Abstract: In this paper we present a frequency-domain classification method for video scenes. Videos from certain topical areas often contain activities with repeating movements; sports videos, home-improvement videos, and videos showing mechanical motion are some example areas. Assessing the main and side frequencies of each repeating movement reveals the motion type. We obtain the frequency domain by transforming spatio-temporal motion trajectories. We further explain how to compute frequency features for video clips and how to use them for classification. The focus of the experimental phase is on the transforms utilized in our system: by comparing various transforms, our experiments identify the transform best suited to a motion-frequency-based approach.
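A minimal sketch of the transform-comparison idea: a de-meaned motion-trajectory signal is mapped to a spectrum by either the FFT or, as one alternative transform, the DCT, and banded average amplitudes form the features; the same classifier can then be trained on each feature variant and the results compared. The band count is an illustrative choice.

```python
import numpy as np
from scipy.fft import rfft, dct

def band_features(signal, transform="fft", n_bands=8):
    """Average spectral amplitude per band for one motion-trajectory signal."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                       # remove the DC component
    if transform == "fft":
        amp = np.abs(rfft(x))[1:]
    elif transform == "dct":
        amp = np.abs(dct(x, norm="ortho"))[1:]
    else:
        raise ValueError(transform)
    bands = np.array_split(amp, n_bands)
    return np.array([b.mean() for b in bands])

# Feed band_features(traj, "fft") and band_features(traj, "dct") to the
# same classifier to compare which transform suits the motion data better.
```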