Abstract: Because of the great advance in multimedia
technology, digital multimedia is vulnerable to malicious
manipulations. In this paper, a public-key, self-recovery, block-based
video authentication technique is proposed which can not only
precisely localize alterations but also recover the missing
data with high reliability. In the proposed block-based technique,
multiple description coding (MDC) is used to generate two codes (two
descriptions) for each block. Although one block code (one
description) is enough to rebuild the altered block, the altered block
is rebuilt with better quality from the two block descriptions. Using
MDC therefore increases the reliability of data recovery. A block signature is
computed using a cryptographic hash function and a doubly linked
chain is utilized to embed the block signature copies and the block
descriptions into the LSBs of distant blocks and the block itself. The
doubly linked chain scheme gives the proposed technique the
capability to thwart vector quantization attacks. In our proposed
technique, anyone can check the authenticity of a given video using
the public key. The experimental results show that the proposed
technique is reliable for detecting, localizing and recovering the
alterations.
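The idea of hashing a block and hiding the signature in least significant bits can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the technique of the paper: the block is a flat list of 64 grayscale values, the signature is a truncated SHA-256 digest rather than a public-key signature, and the signature is embedded only in the block itself. The MDC descriptions and the doubly linked chain to distant blocks are omitted, and the function names are hypothetical.

```python
import hashlib

def block_signature(block, n_bits):
    """Hash the 7 most significant bits of every pixel and return n_bits bits."""
    # Clearing the LSBs first means that embedding does not change the hash input.
    msb = bytes(p & 0xFE for p in block)
    digest = hashlib.sha256(msb).digest()
    bits = []
    for byte in digest:
        for k in range(8):
            bits.append((byte >> (7 - k)) & 1)
    return bits[:n_bits]  # assumes n_bits <= 256

def embed(block):
    """Write one signature bit into the LSB of each pixel of the block."""
    sig = block_signature(block, len(block))
    return [(p & 0xFE) | b for p, b in zip(block, sig)]

def verify(block):
    """Recompute the signature and compare it with the embedded LSBs."""
    sig = block_signature(block, len(block))
    return [p & 1 for p in block] == sig

block = list(range(100, 164))   # a made-up 8x8 block, flattened
marked = embed(block)
tampered = list(marked)
tampered[10] ^= 0x80            # flip the top bit of one pixel
print(verify(marked), verify(tampered))
```

Embedding changes each pixel by at most one gray level, while any change to the upper seven bits of a pixel changes the recomputed signature.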
Abstract: The segmentation of mouth and lips is a fundamental
problem in facial image analysis. In this paper we propose a method
for lip segmentation based on the rg-color histogram. Statistical analysis
shows that the rg color space is optimal for purely
color-based segmentation. Initially, a rough adaptive threshold selects
a histogram region such that all pixels in that region are
skin pixels. Based on those pixels, we build a Gaussian model which
represents the skin pixel distribution and is used to obtain a
refined, optimal threshold. We do not incorporate shape or edge
information. In experiments we show the performance of our lip pixel
segmentation method compared to the ground truth of our dataset and
a conventional watershed algorithm.
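The two ingredients named above, the rg chromaticity space and a Gaussian skin model, can be sketched as follows. The abstract does not state how the refined threshold is derived from the model, so the decision rule here (a pixel counts as lip if its r value lies more than k standard deviations above the skin mean) is an assumption made for illustration, as are the function names and the sample pixel values.

```python
from statistics import mean, stdev

def rg_chromaticity(rgb):
    """Map an (R, G, B) pixel to normalized (r, g) chromaticity."""
    r, g, b = rgb
    s = r + g + b
    if s == 0:
        return (1 / 3, 1 / 3)  # convention chosen here for black pixels
    return (r / s, g / s)

def fit_skin_model(seed_pixels):
    """Fit a 1D Gaussian (mean, standard deviation) to the r values of seed skin pixels."""
    rs = [rg_chromaticity(p)[0] for p in seed_pixels]
    return mean(rs), stdev(rs)

def is_lip(rgb, model, k=3.0):
    """Assumed rule: lips are redder than skin by more than k sigma."""
    mu, sigma = model
    return rg_chromaticity(rgb)[0] > mu + k * sigma

# Made-up seed pixels standing in for the region chosen by the rough threshold.
skin = [(200, 150, 130), (210, 160, 140), (190, 145, 125), (205, 155, 135)]
model = fit_skin_model(skin)
print(is_lip((180, 80, 90), model), is_lip((200, 150, 130), model))
```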
Abstract: Real-time hand tracking is a challenging task in many
computer vision applications such as gesture recognition. This paper
proposes a robust method for hand tracking in a complex environment
using Mean-shift analysis and a Kalman filter in conjunction with a 3D
depth map. The depth information, which is obtained by passive stereo
measurement based on cross-correlation and the known calibration data of
the cameras, solves the overlapping problem between hands and face.
Mean-shift analysis uses the gradient of the Bhattacharyya
coefficient as a similarity function to derive the hand candidate
that is most similar to a given hand target model. A Kalman
filter is then used to estimate the position of the hand target. The results
of hand tracking, tested on various video sequences, are robust to
changes in shape as well as partial occlusion.
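The Bhattacharyya coefficient used as the similarity measure above is simple to state in code. The sketch below compares a target histogram with a few candidate histograms and picks the most similar one by exhaustive comparison. That is a simplification: Mean-shift itself climbs the gradient of this coefficient instead of testing every candidate. The histograms and function names are made up for the example.

```python
import math

def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = sum(hist)
    return [h / total for h in hist]

def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two normalized histograms (1.0 means identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def best_candidate(target, candidates):
    """Index of the candidate histogram most similar to the target."""
    scores = [bhattacharyya(target, c) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)

target = normalize([4, 3, 2, 1])
candidates = [normalize([1, 2, 3, 4]),
              normalize([4, 3, 2, 1]),
              normalize([1, 1, 1, 1])]
print(best_candidate(target, candidates))
```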
Abstract: In this paper, we propose a supervised method for
color image classification based on a multilevel sigmoidal neural
network (MSNN) model. In this method, images are classified into
five categories, i.e., “Car”, “Building”, “Mountain”, “Farm” and
“Coast”. This classification is performed without any segmentation
processes. To verify the learning capabilities of the proposed method,
we compare our MSNN model with the traditional Sigmoidal Neural
Network (SNN) model. The comparison shows that the
MSNN model performs better than the traditional SNN model in
terms of training run time and classification rate. Both color
moments and a multilevel wavelet decomposition technique are used
to extract features from images. The proposed method has been
tested on a variety of real and synthetic images.
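As an illustration of the color-moment features mentioned above, the following sketch computes the first three moments (mean, standard deviation, and the cube root of the third central moment) for each color channel. The abstract does not say which moments are used, so this common three-moment definition is an assumption. The wavelet features and the MSNN classifier are not shown, and the function names are hypothetical.

```python
import math

def color_moments(channel):
    """Return (mean, standard deviation, skewness) of one color channel."""
    n = len(channel)
    mean = sum(channel) / n
    var = sum((v - mean) ** 2 for v in channel) / n
    third = sum((v - mean) ** 3 for v in channel) / n
    # Signed cube root, so that negative skew stays negative.
    skew = math.copysign(abs(third) ** (1 / 3), third)
    return mean, math.sqrt(var), skew

def feature_vector(pixels):
    """Concatenate the three moments of the R, G and B channels (9 values)."""
    feats = []
    for c in range(3):
        feats.extend(color_moments([p[c] for p in pixels]))
    return feats

# Made-up pixels standing in for a whole image.
pixels = [(10, 200, 30), (20, 180, 30), (30, 220, 30), (40, 200, 30)]
print(feature_vector(pixels))
```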
Abstract: This paper proposes a novel stereo vision technique
for top view book scanners which provides us with dense 3D point
clouds of page surfaces. This is a precondition for dewarping bound
volumes independently of 2D information on the page. Our method is
based on algorithms that normally require the projection of pattern
sequences with structured light. We use image sequences of the
moving stripe lighting of the top view scanner instead of an additional
light projection. Thus, the stereo vision setup is simplified without
losing measurement accuracy. Furthermore, we improve a surface
model dewarping method by introducing a difference vector
based on real measurements. Although our proposed method is
inexpensive in both calculation time and hardware requirements,
we present good dewarping results even for difficult examples.
Abstract: In this paper, we propose a novel approach for image
segmentation via fuzzification of Rényi Entropy of Generalized
Distributions (REGD). The fuzzy REGD is used to precisely measure
the structural information of image and to locate the optimal
threshold desired by segmentation. The proposed approach draws
upon the postulation that the optimal threshold concurs with
maximum information content of the distribution. The contributions
of the paper are as follows. First, the fuzzy REGD is introduced as a
measure of the spatial structure of an image. Then, we propose an
efficient entropic segmentation approach using fuzzy REGD.
Although the proposed approach belongs to the entropic segmentation
approaches, which are commonly applied to grayscale
images, it is adapted to be viable for segmenting color images.
Lastly, diverse experiments on real images are carried out, which show
the superior performance of the proposed method.
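The principle that the optimal threshold coincides with maximum information content can be illustrated with plain, non-fuzzy Rényi entropy thresholding of a gray-level histogram. The sketch below is a standard simplification and not the fuzzy REGD of the paper: for every candidate threshold it splits the histogram into two classes, renormalizes each class, and keeps the threshold that maximizes the sum of the two Rényi entropies. The order alpha = 0.5, the toy histogram and the function names are assumptions.

```python
import math

def renyi_entropy(probs, alpha):
    """Rényi entropy of order alpha (alpha > 0, alpha != 1) of a distribution."""
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1 - alpha)

def renyi_threshold(hist, alpha=0.5):
    """Return t such that levels below t form one class and levels >= t the other."""
    total = sum(hist)
    p = [h / total for h in hist]
    best_t, best_score = None, -math.inf
    for t in range(1, len(p)):
        w0, w1 = sum(p[:t]), sum(p[t:])
        if w0 <= 0 or w1 <= 0:
            continue  # one class would be empty
        score = (renyi_entropy([x / w0 for x in p[:t]], alpha)
                 + renyi_entropy([x / w1 for x in p[t:]], alpha))
        if score > best_score:
            best_t, best_score = t, score
    return best_t

# Toy bimodal histogram with an empty valley between the two modes.
hist = [5, 5, 5, 0, 0, 0, 0, 5, 5, 5]
print(renyi_threshold(hist))
```

For this toy histogram every threshold inside the empty valley gives the same score, and the function returns the first of them.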
Abstract: It is hard to perceive the interaction process with machines when visual information is not available. In this paper, we address this issue by providing interaction through visual techniques. Posture recognition is performed for American Sign Language (ASL) to recognize static alphabets and numbers. 3D information is exploited to segment the hands and face using a normal Gaussian distribution and depth information. Features for posture recognition are computed using statistical and geometrical properties which are translation, rotation and scale invariant. Hu moments as statistical features, and circularity and rectangularity as geometrical features, are incorporated to build the feature vectors. These feature vectors are used to train an SVM for classification that recognizes static alphabets and numbers. For the alphabets, curvature analysis is carried out to reduce misclassifications. The experimental results show that the proposed system recognizes posture symbols with recognition rates of 98.65% and 98.6% for ASL alphabets and numbers, respectively.
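The two geometrical features named above can be sketched for a contour given as a polygon. The abstract does not define them, so commonly used definitions are assumed here: circularity as 4πA/P² (1.0 for a circle) and rectangularity as the ratio of the shape area to the area of its bounding rectangle. For brevity, the bounding rectangle is axis-aligned, which is not rotation invariant. A rotation-invariant variant would use the minimum-area bounding rectangle instead. Function names are hypothetical.

```python
import math

def polygon_area(pts):
    """Area of a simple polygon via the shoelace formula."""
    pairs = zip(pts, pts[1:] + pts[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2

def polygon_perimeter(pts):
    """Sum of the edge lengths of the closed polygon."""
    pairs = zip(pts, pts[1:] + pts[:1])
    return sum(math.hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in pairs)

def circularity(pts):
    """4*pi*A / P^2, assumed definition."""
    return 4 * math.pi * polygon_area(pts) / polygon_perimeter(pts) ** 2

def rectangularity(pts):
    """Shape area divided by the area of the axis-aligned bounding box (simplification)."""
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    box = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return polygon_area(pts) / box

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
triangle = [(0, 0), (2, 0), (0, 2)]
print(circularity(square), rectangularity(square), rectangularity(triangle))
```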
Abstract: In modern human-computer interaction (HCI)
systems, emotion recognition is becoming an imperative characteristic.
The quest for effective and reliable emotion recognition in HCI has
resulted in a need for better face detection, feature extraction and
classification. In this paper we present results of feature space analysis
after briefly explaining our fully automatic vision based emotion
recognition method. We demonstrate the compactness of the feature
space and show how the 2D/3D-based method achieves superior features
for the purpose of emotion classification. We also show that
feature normalization yields a largely person-independent feature
space. As a consequence, the classifier architecture has
only a minor influence on the classification result. This is
illustrated with the help of confusion matrices. For this purpose,
advanced classification algorithms, such as Support Vector Machines
and Artificial Neural Networks, are employed, as well as the simple
k-Nearest Neighbor classifier.
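The simplest of the classifiers mentioned above, k-Nearest Neighbor, fits in a few lines. The sketch below is a generic majority-vote k-NN with Euclidean distance. The two-dimensional feature vectors and the emotion labels are made up for the example and are not taken from the paper.

```python
from collections import Counter
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3):
    """Label of the majority among the k training samples closest to the query."""
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical normalized feature vectors with hypothetical labels.
train = [((0, 0), "neutral"), ((0, 1), "neutral"), ((1, 0), "neutral"),
         ((5, 5), "happy"), ((5, 6), "happy"), ((6, 5), "happy")]
print(knn_predict(train, (0.5, 0.5)), knn_predict(train, (5.5, 5.5)))
```

In a compact, well-separated feature space such as the one the abstract describes, even this simple classifier would be expected to perform close to more elaborate ones.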
Abstract: Hand gesture recognition is an active area of research in the vision
community, mainly for the purpose of sign language recognition and
Human Computer Interaction. In this paper, we propose a system to
recognize alphabet characters (A-Z) and numbers (0-9) in real-time
from stereo color image sequences using Hidden Markov Models
(HMMs). Our system is based on three main stages: automatic segmentation
and preprocessing of the hand regions, feature extraction,
and classification. In the automatic segmentation and preprocessing stage,
color and a 3D depth map are used to detect the hands, whose
trajectory is then tracked using the Mean-shift algorithm
and a Kalman filter. In the feature extraction stage, 3D combined features
of location, orientation and velocity with respect to Cartesian
systems are used. k-means clustering is then employed to generate the
HMM codewords. In the final stage, classification, the Baum-Welch
algorithm is used to fully train the HMM parameters.
The gestures of alphabets and numbers are recognized using the Left-Right
Banded model in conjunction with the Viterbi algorithm. Experimental
results demonstrate that our system can successfully recognize hand
gestures with a 98.33% recognition rate.
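The combination of a Left-Right Banded model with the Viterbi algorithm can be sketched for a small discrete HMM. In a Left-Right Banded topology each state may only stay where it is or move to the next state. The number of states, the self-transition probability and the emission matrix below are made-up values, not parameters from the paper, and Baum-Welch training is not shown.

```python
import math

def lrb_transitions(n_states, stay=0.6):
    """Left-Right Banded transition matrix: stay in a state or advance by one."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        A[i][i] = stay
        A[i][i + 1] = 1.0 - stay
    A[-1][-1] = 1.0  # the last state is absorbing
    return A

def viterbi(obs, pi, A, B):
    """Most likely state path for the codeword sequence obs, with its log-probability."""
    def log(x):
        return math.log(x) if x > 0 else -math.inf
    n = len(pi)
    delta = [log(pi[s]) + log(B[s][obs[0]]) for s in range(n)]
    back = []
    for o in obs[1:]:
        prev = delta
        # Best predecessor for every state s.
        ptr = [max(range(n), key=lambda i: prev[i] + log(A[i][s])) for s in range(n)]
        delta = [prev[ptr[s]] + log(A[ptr[s]][s]) + log(B[s][o]) for s in range(n)]
        back.append(ptr)
    last = max(range(n), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]

A = lrb_transitions(3)
pi = [1.0, 0.0, 0.0]                      # always start in the first state
B = [[0.90, 0.05, 0.05],                  # made-up emission probabilities
     [0.05, 0.90, 0.05],
     [0.05, 0.05, 0.90]]
print(viterbi([0, 0, 1, 1, 2], pi, A, B)[0])
```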
Abstract: Extracting meaningful gestures from continuous hand motion
is a challenging task in gesture recognition. In this paper, we
propose an automatic system that recognizes, in real time, both
isolated gestures and meaningful gestures from continuous hand motion
for Arabic numbers from 0 to 9, based on Hidden Markov Models (HMMs).
To handle isolated gestures, HMMs with Ergodic, Left-Right (LR) and
Left-Right Banded (LRB) topologies are applied to the discrete feature
vector that is extracted from stereo color image sequences. These
topologies are evaluated with different numbers of states, ranging
from 3 to 10. A new system is developed to recognize meaningful
gestures in continuous gesture sequences, based on zero-codeword
detection with static velocity motion. The LRB topology, in
conjunction with the Baum-Welch (BW) algorithm for training and the
forward algorithm with Viterbi path for testing, gives the best
performance. Experimental results show that the proposed system can
successfully recognize isolated and meaningful gestures, achieving
average recognition rates of 98.6% and 94.29%, respectively.
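Testing with the forward algorithm amounts to scoring an observed codeword sequence under each gesture model and choosing the model with the highest likelihood. The sketch below shows this for two tiny discrete HMMs. The model names, the two-state topology and all probabilities are made up for illustration. The zero-codeword detection for continuous gestures is not shown.

```python
def forward_likelihood(obs, pi, A, B):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm."""
    n = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][s] for i in range(n)) * B[s][o]
                 for s in range(n)]
    return sum(alpha)

def classify(obs, models):
    """Name of the model under which the observation sequence is most likely."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))

pi = [1.0, 0.0]
A = [[0.5, 0.5],           # two-state left-right chain
     [0.0, 1.0]]
models = {                 # hypothetical gesture models: (pi, A, B)
    "zero": (pi, A, [[0.9, 0.1], [0.1, 0.9]]),
    "one":  (pi, A, [[0.1, 0.9], [0.9, 0.1]]),
}
print(classify([0, 0, 1, 1], models), classify([1, 1, 0, 0], models))
```

For long sequences the raw probabilities underflow, so a practical version would scale alpha at every step or work with logarithms.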