Abstract: With the rapid growth of digital videos and data
communications, video summarization, which provides a shorter version
of a video for fast browsing and retrieval, has become necessary.
Key frame extraction is one of the mechanisms to generate video
summary. In general, the extracted key frames should both represent
the entire video content and contain minimum redundancy. However,
most existing approaches select key frames heuristically; hence,
the selected key frames may not be the most distinctive frames and/or
may not cover the entire content of a video. In this paper, we propose
a video summarization method that provides well-founded
objective functions for selecting key frames. In particular, we apply
a statistical dependency measure, quadratic mutual information,
as the objective function for maximizing coverage of the
entire video content while minimizing redundancy among the
selected key frames. The proposed key frame extraction algorithm
formulates key frame selection as an optimization problem. Through experiments,
we demonstrate that the proposed video summarization
approach produces summaries with better coverage of
the entire video content and less redundancy among key frames
compared to state-of-the-art approaches.
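The coverage/redundancy trade-off described above can be sketched as a greedy selection under a kernel-based surrogate objective. This is an illustrative stand-in, not the paper's exact quadratic mutual information formulation; the Gaussian kernel, its bandwidth `sigma`, the trade-off weight `lam`, and the greedy strategy are all assumptions made for the example:

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise Gaussian kernel matrix between two sets of frame features.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def select_key_frames(features, k, lam=1.0, sigma=1.0):
    """Greedily pick k frames, scoring each candidate set by how well it
    covers all frames (kernel similarity to the nearest selected frame)
    minus a penalty for similarity among the selected frames."""
    n = features.shape[0]
    K = gaussian_kernel(features, features, sigma)
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            cand = selected + [i]
            coverage = K[:, cand].max(axis=1).mean()   # content coverage term
            redundancy = K[np.ix_(cand, cand)].mean()  # redundancy term
            score = coverage - lam * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return sorted(selected)
```

With two well-separated clusters of frame features, the greedy step picks one representative from each cluster, since adding a near-duplicate raises the redundancy term without improving coverage.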
Abstract: Key frame extraction methods select the most
representative frames of a video, which can be used in different areas
of video processing such as video retrieval, video summarization, and video
indexing. In this paper, we present a novel approach for extracting key
frames from video sequences. Each frame is uniquely characterized by
its contours, which are represented by dominant blocks. These
dominant blocks are located on the contours and in the nearby textures.
When the video frames undergo a noticeable change, their dominant
blocks change, and a key frame can be extracted. The dominant
blocks of every frame are computed, and then feature vectors are
extracted from the dominant-block image of each frame and arranged
in a feature matrix. Singular Value Decomposition is used to calculate
the ranks of sliding windows over these matrices. Finally, the computed ranks
are traced, from which the key frames of the video are extracted.
Experimental results show that the proposed approach is robust
against a large range of digital effects used during shot transitions.
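The sliding-window rank computation can be sketched as follows. This is a minimal illustration under assumed parameters (window size `win` and singular-value tolerance `tol` are not taken from the paper): stacking consecutive frame feature vectors into a matrix, a jump in the matrix's numerical rank signals new visual content and thus a key-frame candidate.

```python
import numpy as np

def sliding_window_ranks(feature_matrix, win=5, tol=1e-3):
    """Numerical rank of each sliding window of consecutive frame
    feature vectors, computed via SVD. feature_matrix has one row
    per frame."""
    n = feature_matrix.shape[0]
    ranks = []
    for start in range(n - win + 1):
        window = feature_matrix[start:start + win]
        s = np.linalg.svd(window, compute_uv=False)
        # Count singular values above a relative tolerance.
        ranks.append(int((s > tol * s[0]).sum()))
    return ranks
```

For a sequence of identical frames followed by a different identical run, the rank is 1 inside each run and rises to 2 in windows that straddle the change, marking the transition.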
Abstract: The use of the human hand as a natural interface for human-computer interaction (HCI) serves as the motivation for research in hand gesture recognition. Vision-based hand gesture recognition involves visual analysis of hand shape, position, and/or movement. In this paper, we use the concept of object-based video abstraction to segment frames into video object planes (VOPs), as used in MPEG-4, with each VOP corresponding to one semantically meaningful hand position. Next, the key VOPs are selected on the basis of the amount of change in hand shape – for a given key frame in the sequence, the next key frame is the one in which the hand changes its shape significantly. Thus, an entire video clip is transformed into a small number of representative frames that are sufficient to represent a gesture sequence. Subsequently, we model a particular gesture as a sequence of key frames, each bearing information about its duration. These constitute a finite state machine (FSM). For recognition, the states of the incoming gesture sequence are matched against the states of all the FSMs contained in the database of the gesture vocabulary. The core idea of our proposed representation is that the redundant frames of a gesture video sequence bear only the temporal information of a gesture and hence are discarded for computational efficiency. Experimental results demonstrate the effectiveness of the proposed scheme for key frame extraction, subsequent gesture summarization, and finally gesture recognition.
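The FSM matching step can be sketched as follows. This is an illustrative simplification, not the paper's exact representation: hand shapes are reduced to string labels, and each state carries an assumed duration range in frames.

```python
from dataclasses import dataclass

@dataclass
class State:
    shape: str     # hand-shape label for this key frame (assumed encoding)
    min_dur: int   # minimum admissible duration, in frames
    max_dur: int   # maximum admissible duration, in frames

def matches(fsm, observed):
    """Check whether an observed sequence of (shape, duration) pairs
    traverses the FSM's states in order, with each duration in range."""
    if len(observed) != len(fsm):
        return False
    for state, (shape, dur) in zip(fsm, observed):
        if shape != state.shape or not (state.min_dur <= dur <= state.max_dur):
            return False
    return True

def recognize(gesture_vocab, observed):
    # Return the names of vocabulary gestures whose FSM accepts the sequence.
    return [name for name, fsm in gesture_vocab.items() if matches(fsm, observed)]
```

A gesture is accepted only if every key-frame state appears in order with an admissible duration, which is how discarding redundant frames still preserves the temporal information needed for recognition.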