Abstract: Although much research has been done on human pose
recognition, the camera view-point remains a critical problem for the
overall recognition system. This paper proposes view-point
insensitive human pose recognition. The aims of the proposed system
are view-point insensitivity and real-time processing. The
recognition system consists of a feature extraction module, a neural
network, and real-time feed-forward calculation. First, a
histogram-based method is used to extract features from the
silhouette image; it is suitable for representing the shape of a
human pose. To reduce the dimension of the feature vector, Principal
Component Analysis (PCA) is used. Second, real-time processing is
implemented using Compute Unified Device Architecture (CUDA), which
improves the speed of the feed-forward calculation of the neural
network. We demonstrate the effectiveness of our approach with
experiments in a real environment.
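The abstract does not say which histogram is computed from the silhouette. As a rough illustration only, the sketch below assumes a projection histogram: foreground pixels are counted in horizontal and vertical bands and normalized by the silhouette area. The function name and the `bins` parameter are hypothetical, and the PCA and neural network stages are not shown.

```python
def projection_histogram(silhouette, bins=4):
    """Return normalized row-band and column-band foreground counts.

    silhouette: list of equal-length rows holding 0 (background) or 1.
    bins: number of bands per axis (hypothetical parameter).
    """
    height = len(silhouette)
    width = len(silhouette[0])
    row_hist = [0] * bins
    col_hist = [0] * bins
    for y, row in enumerate(silhouette):
        for x, value in enumerate(row):
            if value:
                # Map the pixel coordinate to its band index.
                row_hist[y * bins // height] += 1
                col_hist[x * bins // width] += 1
    total = sum(row_hist)  # number of foreground pixels
    if total == 0:
        return [0.0] * (2 * bins)
    # Dividing by the area makes the feature independent of silhouette size.
    return [count / total for count in row_hist + col_hist]
```

Normalizing by the foreground area is one simple way to make the feature less sensitive to the subject's distance from the camera.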
Abstract: This paper proposes a view-point insensitive human pose
recognition system using a neural network. The recognition system
consists of a silhouette image capturing module, a data-driven
database, and a neural network. The system has three advantages.
First, it can automatically capture multiple view-point silhouette
images of a 3D human model; this automatic capture module reduces the
time-consuming task of database construction. Second, we develop a
huge feature database to provide view-point insensitivity in pose
recognition. Third, we use a neural network to recognize human pose
from multiple views, because each pose has similar feature patterns
across models, even though each model has a different appearance and
view-point. To construct the database, we create 3D human models
using 3D manipulation tools. Contour shape is used to convert each
silhouette image to a feature vector of dimension 12. This extraction
task is processed semi-automatically, which removes the need to
capture images in a real capturing environment and convert them to
silhouette images. We demonstrate the effectiveness of our approach
with experiments in a virtual environment.
Abstract: Many studies address collision detection between real and virtual objects in 3D space. In general, these techniques need huge computing power, so much of this research relies on cloud computing, network computing, and distributed computing. For this reason, this paper proposes a novel fast 3D collision detection algorithm between real and virtual objects using 2D intersection area. The proposed algorithm uses four cameras and a coarse-and-fine method to improve the accuracy and speed of collision detection. In the coarse step, the system examines the intersection area between real and virtual object silhouettes from all camera views. The result of this step is the indices of the virtual sensors that may collide in 3D space. To decide collision accurately, in the fine step the system examines collision in 3D space using the visual hull algorithm. The performance of the algorithm is verified by comparison with an existing algorithm. We believe the proposed algorithm will help many other research and application fields such as HCI, augmented reality, and intelligent space.
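The coarse step can be sketched as a per-view silhouette overlap test. The sketch assumes that a virtual sensor remains a collision candidate only if its silhouette overlaps the real object's silhouette in every camera view; that rule, the function names, and the `min_area` parameter are assumptions for illustration, not details given in the abstract. The fine visual hull step is not shown.

```python
def overlap_area(mask_a, mask_b):
    """Count pixels that are foreground (1) in both binary masks."""
    return sum(
        1
        for row_a, row_b in zip(mask_a, mask_b)
        for a, b in zip(row_a, row_b)
        if a and b
    )


def coarse_candidates(real_views, sensor_views, min_area=1):
    """Return indices of virtual sensors that may collide in 3D.

    real_views: list of binary masks of the real object, one per camera.
    sensor_views: dict mapping a sensor index to its list of binary
        masks, one per camera, in the same camera order.
    min_area: smallest overlap (in pixels) that counts (hypothetical).
    """
    candidates = []
    for index, views in sensor_views.items():
        # Assumed rule: a 3D collision must show up as a 2D overlap
        # in every view, so one empty intersection rules the sensor out.
        if all(
            overlap_area(real, virtual) >= min_area
            for real, virtual in zip(real_views, views)
        ):
            candidates.append(index)
    return sorted(candidates)
```

Only the sensors returned here would need the more expensive 3D check.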
Abstract: In this paper, we propose an improved 3D star skeleton
technique, a skeletonization suited to human posture representation
that reflects the 3D information of human posture. Moreover, the
proposed technique is simple enough to be performed in real time.
Existing skeleton construction techniques, such as distance
transformation, Voronoi diagram, and thinning, focus on the precision
of skeleton information. Those techniques are therefore not
applicable to real-time posture recognition, since they are
computationally expensive and highly susceptible to boundary noise.
Although a 2D star skeleton was proposed to address these problems,
it is limited in describing the 3D information of the posture. To
represent human posture effectively, the constructed skeleton should
consider the 3D information of the posture. The proposed 3D star
skeleton contains 3D data of the human body and focuses on human
action and posture recognition. Our 3D star skeleton uses 8
projection maps that hold 2D silhouette information and depth data of
the human surface. The extremal points can be extracted as the
features of the 3D star skeleton without searching the whole boundary
of the object. Therefore, our 3D star skeleton executes faster than
the "greedy" 3D star skeleton, which uses all boundary points on the
surface. Moreover, our method can offer a more accurate skeleton of
the posture than the existing star skeleton, since the 3D data of the
object is taken into account. Additionally, we build a codebook, a
collection of representative 3D star skeletons for 7 postures, to
recognize which posture a constructed skeleton represents.
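For context, the basic 2D star skeleton idea that this work builds on can be sketched as follows: compute the centroid of the contour, measure the distance from the centroid to each boundary point, and keep the local maxima of that distance as extremal points. This is a minimal sketch of the 2D baseline that walks the whole boundary, not the proposed 3D method; the smoothing of the distance signal that is usually applied is omitted, and all names are hypothetical.

```python
import math


def star_skeleton(boundary):
    """Return the centroid and extremal points of a closed contour.

    boundary: ordered list of (x, y) points along the contour.
    """
    n = len(boundary)
    cx = sum(x for x, _ in boundary) / n
    cy = sum(y for _, y in boundary) / n
    # Distance from the centroid to every boundary point, in order.
    dist = [math.hypot(x - cx, y - cy) for x, y in boundary]
    extremal = []
    for i in range(n):
        prev_d = dist[i - 1]        # index -1 wraps around the contour
        next_d = dist[(i + 1) % n]
        # A local maximum of the distance signal marks an extremity
        # such as the head, a hand, or a foot.
        if dist[i] > prev_d and dist[i] >= next_d:
            extremal.append(boundary[i])
    return (cx, cy), extremal
```

Connecting the centroid to each extremal point gives the "star" shape that names the skeleton.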
Abstract: In this article, we present our research work in
Human-machine Interaction. The research consists of manipulating
the workspace with the eyes. We present some of our results, in
particular eye detection and the recognition of mouse actions. As a
result, a handicapped user becomes able to interact with the machine
in a more intuitive way in diverse applications and contexts. To test
our application, we chose to work in real time on videos captured
by a camera placed in front of the user.
Abstract: Human computer interaction has progressed
considerably from the traditional modes of interaction. Vision based
interfaces are a revolutionary technology, allowing interaction
through human actions and gestures. Researchers have developed
numerous accurate techniques; however, with few exceptions, these
techniques are not evaluated using standard HCI techniques. In
this paper we present a comprehensive framework to address this
issue. Our evaluation of a computer vision application shows that, in
addition to accuracy, it is vital to address human factors.
Abstract: Gesture recognition is studied vigorously for
communication between humans and computers in interactive computing
environments. Many studies have proposed efficient recognition
algorithms using images captured by 2D cameras. However, these
methods are limited in that the extracted features cannot fully
represent the object in the real world. Although many studies have
used 3D features instead of 2D features for more accurate gesture
recognition, problems such as the processing time needed to generate
3D objects remain unsolved. Therefore, we propose a method to extract
3D features combined with 3D object reconstruction. This method uses
a modified GPU-based visual hull generation algorithm that disables
unnecessary processes, such as texture calculation, to generate three
kinds of 3D projection maps as the 3D feature: a nearest boundary, a
farthest boundary, and a thickness of the object projected on the
base plane. In the experimental results section, we present results
of the proposed method on eight human postures (T shape, both hands
up, right hand up, left hand up, hands front, stand, sit, and bend)
and compare the computational time of the proposed method with that
of previous methods.
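The three projection maps can be illustrated on a small voxel grid. The sketch below is a CPU-only illustration, not the GPU-based algorithm: it assumes the object is given as a binary occupancy grid indexed `voxels[z][y][x]`, that the projection runs along the z axis, and that thickness is the number of occupied voxels in each column. These conventions and the function name are assumptions.

```python
def projection_maps(voxels):
    """Return (nearest, farthest, thickness) maps for a voxel grid.

    voxels: nested lists indexed [z][y][x] holding 0 or 1.
    Empty columns get None in the boundary maps and 0 thickness.
    """
    depth = len(voxels)
    height = len(voxels[0])
    width = len(voxels[0][0])
    nearest = [[None] * width for _ in range(height)]
    farthest = [[None] * width for _ in range(height)]
    thickness = [[0] * width for _ in range(height)]
    for z in range(depth):  # z increases away from the projection plane
        for y in range(height):
            for x in range(width):
                if voxels[z][y][x]:
                    if nearest[y][x] is None:
                        nearest[y][x] = z  # first occupied voxel seen
                    farthest[y][x] = z     # last occupied voxel so far
                    thickness[y][x] += 1
    return nearest, farthest, thickness
```

The three maps have the same size as the projection plane, so they can be flattened and concatenated into a fixed-length feature vector.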
Abstract: Many automotive accidents are caused by blind spots and driver inattentiveness. A blind spot is an area that is invisible from the driver's viewpoint without head rotation. Several methods are available to assist drivers. The simplest are rear mirrors and wide-angle lenses, but these require human involvement, so their accuracy depends on the driver. Another, automated approach makes use of sensors such as sonar or radar to gather range information, which is then processed and used for collision detection. The disadvantages of this approach are low angular resolution and limited sensing volume. This paper presents panoramic sensor based automotive vehicle monitoring.
Abstract: Cameras are often mounted on platforms that can move like rovers, booms, gantries and aircraft. People operate such platforms to capture desired views of scene or target. To avoid collisions with the environment and occlusions, such platforms often possess redundant degrees-of-freedom. As a result, manipulating such platforms demands much skill. Visual-servoing some degrees-of-freedom may reduce operator burden and improve tracking performance. This concept, which we call human-in-the-loop visual-servoing, is demonstrated in this paper and applies an α-β-γ filter and feedforward controller to a broadcast camera boom.
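An α-β-γ filter tracks position, velocity, and acceleration from noisy position measurements with three fixed gains. The sketch below shows the textbook predict and update equations for a single axis; the gain values, the time step, and the function names are placeholders for illustration and are not taken from the paper.

```python
def abg_step(state, measurement, dt, alpha, beta, gamma):
    """Advance an alpha-beta-gamma filter by one measurement.

    state: tuple (position, velocity, acceleration).
    """
    x, v, a = state
    # Predict with a constant-acceleration motion model.
    x_pred = x + v * dt + 0.5 * a * dt * dt
    v_pred = v + a * dt
    # Correct each component with its share of the residual.
    residual = measurement - x_pred
    x_new = x_pred + alpha * residual
    v_new = v_pred + (beta / dt) * residual
    a_new = a + (2.0 * gamma / (dt * dt)) * residual
    return (x_new, v_new, a_new)


def abg_filter(measurements, dt=1.0, alpha=0.5, beta=0.4, gamma=0.1):
    """Return filtered positions (gains here are illustrative only)."""
    state = (measurements[0], 0.0, 0.0)
    estimates = []
    for z in measurements:
        state = abg_step(state, z, dt, alpha, beta, gamma)
        estimates.append(state[0])
    return estimates
```

In a visual-servoing loop, the predicted position could be used as the target estimate between image measurements, while the velocity and acceleration estimates could feed a feedforward term.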
Abstract: Compensating physiological motion in the context
of minimally invasive cardiac surgery has become an attractive
issue, since such surgery outperforms traditional cardiac procedures
and offers remarkable benefits. Owing to space restrictions, computer
vision techniques have proven to be the most practical and suitable
solution. However, the lack of robustness and efficiency of existing
methods makes physiological motion compensation an open and
challenging problem. This work focuses on increasing robustness and
efficiency by exploring the classes of ℓ1- and ℓ2-regularized
optimization, emphasizing the use of explicit regularization. Both
approaches are based on natural features of the heart using intensity
information. The results identified the ℓ1-regularized optimization
class as the best, since it offered the lowest computational cost and
the smallest average error, and it proved to work even under complex
deformations.
Abstract: In this paper, a Cooperative Multi-robot for Carrying
Targets (CMCT) algorithm is proposed. The multi-robot team
consists of three robots: one supervisor and two workers that carry
boxes in a store of 100×100 m². Each robot has a self-recharging
mechanism. CMCT minimizes the robots' working time for carrying many
boxes during the day by working in parallel. That is, the supervisor
detects the required variables at the same time as the other robots
work with the previous variables. It works with straightforward
mechanical models using simple cosine laws. It finds each robot's
shortest path to the target position while avoiding obstacles, using
a proposed CMCT path planning (CMCT-PP) algorithm. It prevents
collisions between robots during movement. The robots interact over
an ad hoc wireless network. Simulation results show that the proposed
system, which consists of the CMCT algorithm and its accompanying
CMCT-PP algorithm, achieves a large improvement in time and distance
over existing algorithms while performing the required tasks.
Abstract: Human identification at a distance has recently gained
growing interest from computer vision researchers. Gait recognition
aims essentially to address this problem by identifying people based
on the way they walk [1]. Gait recognition has 3 steps. The first step
is preprocessing, the second step is feature extraction and the third
one is classification. This paper focuses on the classification step that
is essential to increase the CCR (Correct Classification Rate).
Multilayer Perceptron (MLP) is used in this work. Neural Networks
imitate the human brain to perform intelligent tasks [3]. They can
represent complicated relationships between input and output and
acquire knowledge about these relationships directly from the data
[2]. In this paper we apply an MLP neural network to 11 views in our
database and compare the CCR values for these views. Experiments are
performed
with the NLPR databases, and the effectiveness of the proposed
method for gait recognition is demonstrated.
Abstract: Hand gesture recognition is an active area of research in
the vision community, mainly for the purpose of sign language
recognition and Human Computer Interaction. In this paper, we propose
a system to recognize alphabet characters (A-Z) and numbers (0-9) in
real time from stereo color image sequences using Hidden Markov
Models (HMMs). Our system is based on three main stages: automatic
segmentation and preprocessing of the hand regions, feature
extraction, and classification. In the automatic segmentation and
preprocessing stage, color and a 3D depth map are used to detect the
hands, and the hand trajectory is then tracked using the Mean-shift
algorithm and a Kalman filter. In the feature extraction stage,
combined 3D features of location, orientation, and velocity with
respect to Cartesian systems are used, and k-means clustering is then
employed to generate the HMM codewords. In the final stage,
classification, the Baum-Welch algorithm is used to fully train the
HMM parameters. The gestures for alphabet characters and numbers are
recognized using a Left-Right Banded model in conjunction with the
Viterbi algorithm. Experimental results demonstrate that our system
can successfully recognize hand gestures with a 98.33% recognition
rate.
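The Viterbi decoding used at recognition time can be sketched for a discrete HMM as follows. The code is a generic textbook version working on plain probabilities; the tiny two-state left-right model in the usage note is made up for illustration and has nothing to do with the trained gesture models.

```python
def viterbi(observations, start_prob, trans_prob, emit_prob):
    """Return the most likely state path and its probability.

    observations: list of codeword indices.
    start_prob[s]: probability of starting in state s.
    trans_prob[p][s]: probability of moving from state p to state s.
    emit_prob[s][o]: probability that state s emits codeword o.
    """
    n_states = len(start_prob)
    score = [start_prob[s] * emit_prob[s][observations[0]]
             for s in range(n_states)]
    backpointers = []
    for obs in observations[1:]:
        new_score = []
        pointers = []
        for s in range(n_states):
            # Best predecessor for state s at this time step.
            best_prev = max(range(n_states),
                            key=lambda p: score[p] * trans_prob[p][s])
            new_score.append(
                score[best_prev] * trans_prob[best_prev][s] * emit_prob[s][obs]
            )
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    best_prob = score[state]
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, best_prob
```

For example, with `start_prob = [1.0, 0.0]`, `trans_prob = [[0.5, 0.5], [0.0, 1.0]]`, and `emit_prob = [[0.9, 0.1], [0.2, 0.8]]`, the sequence `[0, 0, 1, 1]` decodes to the path `[0, 0, 1, 1]`. For long sequences, log-probabilities are normally used instead to avoid numerical underflow.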
Abstract: We developed a vision interface framework for an immersive projection system, CAVE, in the virtual reality research field using hand gesture recognition with computer vision techniques. The background image was subtracted from the current webcam image frame, and we convert the color space of the image into HSV space. Then we mask skin regions using a skin color range threshold and apply a noise reduction operation. We made blobs from the image, and gestures were recognized using these blobs. Using our hand gesture recognition, we could implement an effective interface without bothering devices for CAVE.
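The HSV skin-masking step can be sketched with the standard library alone. The threshold values below are placeholders chosen for illustration, since the abstract does not give the skin color range; background subtraction, noise reduction, and blob extraction are not shown.

```python
import colorsys


def skin_mask(image, h_max=0.14, s_min=0.2, s_max=0.7, v_min=0.35):
    """Return a binary mask marking pixels inside an HSV skin range.

    image: rows of (r, g, b) tuples with components in 0..255.
    The threshold defaults are hypothetical, not values from the paper.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            # colorsys works on floats in [0, 1] and returns h, s, v
            # in [0, 1] as well.
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = h <= h_max and s_min <= s <= s_max and v >= v_min
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask
```

Thresholding in HSV rather than RGB separates hue from brightness, which tends to make a fixed range less sensitive to lighting changes.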
Abstract: This paper describes new computer vision algorithms
that have been developed to track moving objects as part of a
long-term study into the design of (semi-)autonomous vehicles. We
present the results of a study to exploit variable kernels for tracking in
video sequences. The basis of our work is the mean shift
object-tracking algorithm; for a moving target, it is usual to define a
rectangular target window in an initial frame, and then process the data
within that window to separate the tracked object from the background
by the mean shift segmentation algorithm. Rather than use the
standard Epanechnikov kernel, we have used a kernel weighted by the
Chamfer distance transform to improve the accuracy of target
representation and localization, minimising the distance between the
two distributions in RGB color space using the Bhattacharyya
coefficient. Experimental results show the improved tracking
capability and versatility of the algorithm in comparison with results
using the standard kernel. These algorithms are incorporated as part of
a robot test-bed architecture which has been used to demonstrate their
effectiveness.
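The Bhattacharyya coefficient measures the similarity of two normalized histograms, such as the color distributions of the target model and a candidate window. The sketch below shows the coefficient and the distance commonly derived from it in mean shift tracking; the kernel weighting of the histograms is not shown, and the function names are hypothetical.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative counts so it sums to 1."""
    total = sum(hist)
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Return the similarity of two normalized histograms (0 to 1)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def bhattacharyya_distance(p, q):
    """Return sqrt(1 - coefficient), the distance often minimized in
    mean shift tracking."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))
```

A coefficient of 1 means identical distributions and 0 means no overlap, so maximizing the coefficient is equivalent to minimizing this distance.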
Abstract: Motion estimation is a key problem in video
processing and computer vision. Optical flow motion estimation can
achieve high estimation accuracy when the motion vector is small.
The three-step search algorithm can handle large motion vectors but
is not very accurate. This paper proposes a joint algorithm that
achieves high estimation accuracy regardless of whether the motion
vector is small or large, while keeping the computation cost much
lower than that of full search.
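The three-step search mentioned above can be sketched as a block-matching routine that tests nine candidate displacements around the current best match and halves the step each round. The sketch uses the sum of absolute differences as the matching cost; the block size, the initial step of 4, and the names are assumptions, and the joint algorithm with optical flow is not shown.

```python
def sad(ref, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at
    (bx, by) and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def three_step_search(ref, cur, bx, by, size=4, step=4):
    """Return ((dx, dy), cost) for the block at (bx, by) in `cur`."""
    height, width = len(ref), len(ref[0])
    best_dx, best_dy = 0, 0
    best_cost = sad(ref, cur, bx, by, 0, 0, size)
    while step >= 1:
        center_dx, center_dy = best_dx, best_dy
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                dx, dy = center_dx + ox, center_dy + oy
                # Skip candidates whose block leaves the frame.
                if not (0 <= bx + dx and bx + dx + size <= width
                        and 0 <= by + dy and by + dy + size <= height):
                    continue
                cost = sad(ref, cur, bx, by, dx, dy, size)
                if cost < best_cost:
                    best_cost, best_dx, best_dy = cost, dx, dy
        step //= 2  # steps of 4, 2, 1 cover displacements up to 7
    return (best_dx, best_dy), best_cost
```

With steps of 4, 2, and 1 the search evaluates at most 27 candidates instead of the 225 that a full search over ±7 pixels would need, but it can settle on a local minimum of the cost, which is the accuracy limitation noted above.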
Abstract: The fuzzy technique is an operator introduced to
simulate, at a mathematical level, the compensatory behavior in the
process of decision making or subjective evaluation. This paper
introduces such operators by means of a computer vision application.
In this paper a novel method based on a fuzzy logic reasoning
strategy is proposed for edge detection in digital images without
determining a threshold value. The proposed approach begins by
segmenting the images into regions using a floating 3x3 binary matrix.
The edge pixels are mapped to a range of values distinct from each
other. The robustness of the results of the proposed method on
different captured images is compared with that of the results
obtained with the linear Sobel operator. The method consistently
yields smooth, straight lines for straight edges and good roundness
for curved lines. At the same time, corners become sharper and can be
defined easily.
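For reference, the Sobel operator used as the comparison baseline can be sketched as a pair of 3x3 convolutions whose responses are combined into a gradient magnitude. This is only the baseline, not the fuzzy method; border pixels are left at zero for simplicity, and the function name is hypothetical.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Return the gradient magnitude of a grayscale image.

    image: rows of numeric intensities. Border pixels stay at 0.0.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = 0
            gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = image[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * pixel  # horizontal change
                    gy += SOBEL_Y[j][i] * pixel  # vertical change
            out[y][x] = math.hypot(gx, gy)
    return out
```

Turning this magnitude into an edge map normally requires choosing a threshold, which is the step the fuzzy approach aims to avoid.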
Abstract: Gesture recognition is a challenging task that requires
extracting meaningful gestures from continuous hand motion. In this
paper, we propose an automatic system that recognizes both isolated
gestures and meaningful gestures from continuous hand motion for
Arabic numbers from 0 to 9 in real time, based on Hidden Markov
Models (HMMs). To handle isolated gestures, HMMs with Ergodic,
Left-Right (LR), and Left-Right Banded (LRB) topologies are applied
to the discrete feature vector extracted from stereo color image
sequences. These topologies are evaluated with different numbers of
states, ranging from 3 to 10. A new system is developed to recognize
meaningful gestures in continuous motion based on zero-codeword
detection with static velocity motion. The LRB topology, in
conjunction with the Baum-Welch (BW) algorithm for training and the
forward algorithm with the Viterbi path for testing, presents the
best performance. Experimental results show that the proposed system
can successfully recognize isolated and meaningful gestures,
achieving average recognition rates of 98.6% and 94.29%,
respectively.
Abstract: In this paper, we present a comparative study of two computer vision systems for object recognition and tracking. The algorithms represent two different approaches based on regions, each constituted by a set of pixels, that parameterize objects in shot sequences. For image segmentation and object detection, the FCM technique is used; the overlap between cluster distributions is minimized by using a suitable color space (other than RGB). The first technique takes into account a priori probabilities governing the computation of the various clusters to track objects. A Parzen kernel method is described that allows identifying the players in each frame; we also show the importance of searching for the standard deviation value of the Gaussian probability density function. Region matching is carried out by an algorithm that operates on the Mahalanobis distance between region descriptors in two subsequent frames and uses singular value decomposition to compute a set of correspondences satisfying both the principle of proximity and the principle of exclusion.
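The Mahalanobis distance between region descriptors can be illustrated for two-dimensional descriptors, where the covariance matrix can be inverted in closed form. In the sketch below the descriptors are assumed to be 2D points such as region centroids, and the matching is a plain nearest-neighbor choice; the singular value decomposition step that enforces proximity and exclusion is not shown, and all names are hypothetical.

```python
import math


def mahalanobis_2d(a, b, cov):
    """Return the Mahalanobis distance between 2D descriptors a and b.

    cov: 2x2 covariance matrix as ((sxx, sxy), (syx, syy)).
    """
    (sxx, sxy), (syx, syy) = cov
    det = sxx * syy - sxy * syx
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = a[0] - b[0], a[1] - b[1]
    # d^T * inverse(cov) * d, expanded for the 2x2 case.
    d2 = (syy * dx * dx - (sxy + syx) * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


def nearest_region(descriptor, candidates, cov):
    """Return the index of the candidate closest to `descriptor`."""
    return min(
        range(len(candidates)),
        key=lambda i: mahalanobis_2d(descriptor, candidates[i], cov),
    )
```

Unlike the Euclidean distance, this distance scales each direction by its variance, so a displacement along a direction in which descriptors vary widely counts for less.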
Abstract: When reconstructing a scenario, it is necessary to
know the structure of the elements present in the scene in order to
interpret it. In this work we link 3D scene reconstruction to
evolutionary algorithms through stereo vision theory. We consider
stereo vision as a method that reconstructs a scene using only a pair
of images of the scene and some computation. Through several images
of a scene, captured from different positions, stereo vision can give
us an idea of the three-dimensional characteristics of the world.
Stereo vision usually requires two cameras, by analogy with the
mammalian vision system. In this work we employ only one camera,
which is translated along a path, capturing images at fixed distance
intervals. As we cannot perform all the computations required for an
exhaustive reconstruction, we employ an evolutionary algorithm to
partially reconstruct the scene in real time. The algorithm employed
is the fly algorithm, which employs "flies" to reconstruct the
principal characteristics of the world following certain evolutionary
rules.