View-Point Insensitive Human Pose Recognition using Neural Network and CUDA

Although much research has been done on human pose recognition, the camera view-point remains a critical problem for the overall recognition system. This paper proposes a view-point insensitive human pose recognition system. The aims of the proposed system are view-point insensitivity and real-time processing. The recognition system consists of a feature extraction module, a neural network, and real-time feed-forward calculation. First, a histogram-based method is used to extract features from the silhouette image; it is suitable for representing the shape of a human pose. To reduce the dimension of the feature vector, Principal Component Analysis (PCA) is used. Second, real-time processing is implemented using the Compute Unified Device Architecture (CUDA), which speeds up the feed-forward calculation of the neural network. We demonstrate the effectiveness of our approach with experiments in a real environment.

View-Point Insensitive Human Pose Recognition using Neural Network

This paper proposes a view-point insensitive human pose recognition system using a neural network. The recognition system consists of a silhouette image capturing module, a data-driven database, and a neural network. Our system has three advantages. First, it can automatically capture silhouette images of a 3D human model from multiple view-points; this automatic capture module reduces the time-consuming task of database construction. Second, we build a large feature database to provide view-point insensitivity in pose recognition. Third, we use a neural network to recognize human pose from multiple views, because a given pose produces similar feature patterns across models, even though each model differs in appearance and view-point. To construct the database, we create 3D human models using 3D manipulation tools. Contour shape is used to convert each silhouette image into a feature vector of degree 12. This extraction is performed semi-automatically, which has the benefit that capturing images in a real environment and converting them to silhouette images is unnecessary. We demonstrate the effectiveness of our approach with experiments in a virtual environment.

Fast 3D Collision Detection Algorithm using 2D Intersection Area

Much research has addressed collision detection between real and virtual objects in 3D space. In general, these techniques need huge computing power, so many studies rely on cloud computing, network computing, and distributed computing. For this reason, this paper proposes a novel fast 3D collision detection algorithm between real and virtual objects using the 2D intersection area. The proposed algorithm uses four cameras and a coarse-and-fine method to improve the accuracy and speed of collision detection. In the coarse step, the system examines the intersection area between the real and virtual object silhouettes in all camera views. The result of this step is the index of the virtual sensors that may collide in 3D space. To decide collisions accurately, in the fine step the system performs collision detection in 3D space using the visual hull algorithm. The performance of the algorithm is verified by comparison with an existing algorithm. We believe the proposed algorithm can help many other research and application fields, such as HCI, augmented reality, and intelligent spaces.
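
The coarse step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes binary silhouette masks stored as nested lists, one mask per camera view, and it assumes a virtual sensor is kept as a collision candidate only when its silhouette overlaps the real silhouette in every view. The names intersection_area, coarse_candidates, and min_area are hypothetical.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[int]]  # binary silhouette image, 1 = object pixel


def intersection_area(real: Mask, virtual: Mask) -> int:
    """Count the pixels that are set in both silhouettes of one camera view."""
    return sum(
        1
        for real_row, virtual_row in zip(real, virtual)
        for r, v in zip(real_row, virtual_row)
        if r and v
    )


def coarse_candidates(
    real_views: Sequence[Mask],
    sensor_views: Sequence[Sequence[Mask]],
    min_area: int = 1,  # assumed threshold, in pixels
) -> List[int]:
    """Return indices of virtual sensors that overlap the real object in all views.

    sensor_views[i][c] is the silhouette of virtual sensor i in camera c.
    Only these candidates would be passed on to the 3D fine step.
    """
    candidates = []
    for index, views in enumerate(sensor_views):
        if all(
            intersection_area(real, virtual) >= min_area
            for real, virtual in zip(real_views, views)
        ):
            candidates.append(index)
    return candidates
```

Requiring overlap in every view is a conservative filter: two objects that intersect in 3D project to overlapping silhouettes in each camera, so a sensor with an empty overlap in any view can be skipped before the more expensive fine step.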

3D Star Skeleton for Fast Human Posture Representation

In this paper, we propose an improved 3D star skeleton technique, a skeletonization method suited to human posture representation that reflects the 3D information of human posture. Moreover, the proposed technique is simple and can therefore be performed in real time. Existing skeleton construction techniques, such as distance transformation, Voronoi diagrams, and thinning, focus on the precision of the skeleton information. Consequently, they are not applicable to real-time posture recognition, since they are computationally expensive and highly susceptible to boundary noise. Although a 2D star skeleton was proposed to address these problems, it is limited in describing the 3D information of the posture. To represent human posture effectively, the constructed skeleton should take the 3D information of the posture into account. The proposed 3D star skeleton contains 3D data of the human body and targets human action and posture recognition. It uses 8 projection maps that hold 2D silhouette information and depth data of the human surface. The extremal points can be extracted as the features of the 3D star skeleton without searching the whole boundary of the object. Therefore, our 3D star skeleton executes faster than the "greedy" 3D star skeleton, which uses all boundary points on the surface. Moreover, our method yields a more accurate posture skeleton than the existing star skeleton, since it takes the 3D data of the object into account. Additionally, we build a codebook, a collection of representative 3D star skeletons for 7 postures, to recognize which posture a constructed skeleton represents.

Mouse Pointer Tracking with Eyes

In this article, we present our research in human-machine interaction. The research consists of manipulating the workspace with the eyes. We present some of our results, in particular the detection of eyes and the recognition of mouse actions. With this approach, a handicapped user can interact with the machine in a more intuitive way in diverse applications and contexts. To test our application, we chose to work in real time on videos captured by a camera placed in front of the user.

Usability Evaluation Framework for Computer Vision Based Interfaces

Human computer interaction has progressed considerably from the traditional modes of interaction. Vision based interfaces are a revolutionary technology, allowing interaction through human actions and gestures. Researchers have developed numerous accurate techniques; however, with few exceptions, these techniques are not evaluated using standard HCI techniques. In this paper we present a comprehensive framework to address this issue. Our evaluation of a computer vision application shows that, in addition to accuracy, it is vital to address human factors.

Real-time 3D Feature Extraction without Explicit 3D Object Reconstruction

Gesture recognition is being studied vigorously as a means of communication between humans and computers in interactive computing environments. Many studies have proposed efficient recognition algorithms that use images captured by 2D cameras. However, these methods are limited in that the extracted features cannot fully represent the object in the real world. Although many studies have used 3D features instead of 2D features for more accurate gesture recognition, problems such as the processing time needed to generate 3D objects remain unsolved in related research. We therefore propose a method that extracts 3D features in combination with 3D object reconstruction. The method uses a modified GPU-based visual hull generation algorithm that disables unnecessary processes, such as texture calculation, to generate three kinds of 3D projection maps as the 3D features: the nearest boundary, the farthest boundary, and the thickness of the object projected onto the base plane. In the experimental results, we present the results of the proposed method on eight human postures (T shape, both hands up, right hand up, left hand up, hands front, stand, sit, and bend) and compare its computation time with that of previous methods.
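
To make the three projection maps concrete, here is a small CPU sketch. It is only an illustration under stated assumptions: it starts from an explicit occupancy grid voxels[x][y][z], which the proposed GPU method is designed to avoid building, and it assumes the maps are taken along the z axis for each (x, y) cell of the base plane. The function name projection_maps and the use of None for empty cells are hypothetical choices.

```python
from typing import List, Optional, Sequence, Tuple

Grid = Sequence[Sequence[Sequence[int]]]  # voxels[x][y][z], truthy = occupied


def projection_maps(
    voxels: Grid,
) -> Tuple[List[List[Optional[int]]], List[List[Optional[int]]], List[List[int]]]:
    """Compute nearest-boundary, farthest-boundary, and thickness maps.

    For each (x, y) cell the column along z is scanned:
    nearest   = lowest occupied z index (None if the column is empty),
    farthest  = highest occupied z index (None if the column is empty),
    thickness = number of occupied voxels in the column.
    """
    nearest, farthest, thickness = [], [], []
    for plane in voxels:
        near_row, far_row, thick_row = [], [], []
        for column in plane:
            occupied = [z for z, value in enumerate(column) if value]
            near_row.append(occupied[0] if occupied else None)
            far_row.append(occupied[-1] if occupied else None)
            thick_row.append(len(occupied))
        nearest.append(near_row)
        farthest.append(far_row)
        thickness.append(thick_row)
    return nearest, farthest, thickness
```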

Panoramic Sensor Based Blind Spot Accident Prevention System

Many automotive accidents are caused by blind spots and driver inattentiveness. A blind spot is an area that the driver cannot see without turning the head. Several methods are available for assisting drivers. The simplest are rear mirrors and wide-angle lenses. However, these methods require human assistance, so their accuracy depends on the driver. Another, automated approach makes use of sensors such as sonar or radar. These sensors gather range information, which is processed and used to detect collisions. The disadvantages of this approach are low angular resolution and limited sensing volumes. This paper presents a panoramic sensor based automotive vehicle monitoring system.

Enhancing Camera Operator Performance with Computer Vision Based Control

Cameras are often mounted on platforms that can move like rovers, booms, gantries and aircraft. People operate such platforms to capture desired views of scene or target. To avoid collisions with the environment and occlusions, such platforms often possess redundant degrees-of-freedom. As a result, manipulating such platforms demands much skill. Visual-servoing some degrees-of-freedom may reduce operator burden and improve tracking performance. This concept, which we call human-in-the-loop visual-servoing, is demonstrated in this paper and applies an α-β-γ filter and feedforward controller to a broadcast camera boom.
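
For readers unfamiliar with the α-β-γ filter mentioned above, the following is a minimal one-dimensional sketch. It is not the controller from the paper: the gain convention (acceleration correction scaled by 2γ/dt²) is one of several in common use and is an assumption here, as are the class name AlphaBetaGammaFilter and the idea of feeding it a tracked target coordinate.

```python
from dataclasses import dataclass


@dataclass
class AlphaBetaGammaFilter:
    """Fixed-gain tracker for position, velocity, and acceleration."""

    alpha: float      # position gain
    beta: float       # velocity gain
    gamma: float      # acceleration gain
    dt: float         # sample period in seconds
    x: float = 0.0    # position estimate
    v: float = 0.0    # velocity estimate
    a: float = 0.0    # acceleration estimate

    def update(self, measurement: float) -> float:
        """Predict one step ahead, then correct with the measurement residual."""
        dt = self.dt
        # Prediction under a constant-acceleration motion model.
        x_pred = self.x + self.v * dt + 0.5 * self.a * dt * dt
        v_pred = self.v + self.a * dt
        residual = measurement - x_pred
        # Correction with fixed gains (assumed convention, see lead-in).
        self.x = x_pred + self.alpha * residual
        self.v = v_pred + (self.beta / dt) * residual
        self.a = self.a + (2.0 * self.gamma / (dt * dt)) * residual
        return self.x
```

In a visual-servoing loop, the predicted position and velocity could serve as the feedforward term, which is why a filter that also estimates acceleration is attractive for a moving target.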

In Search of Robustness and Efficiency via l1- and l2-Regularized Optimization for Physiological Motion Compensation

Compensating physiological motion in the context of minimally invasive cardiac surgery has become an attractive topic, since such surgery outperforms traditional cardiac procedures and offers remarkable benefits. Owing to space restrictions, computer vision techniques have proven to be the most practical and suitable solution. However, the lack of robustness and efficiency of existing methods makes physiological motion compensation an open and challenging problem. This work focuses on increasing robustness and efficiency by exploring the classes of l1- and l2-regularized optimization, emphasizing the use of explicit regularization. Both approaches are based on natural features of the heart using intensity information. The results identified the l1-regularized optimization class as the best, since it offered the lowest computational cost and the smallest average error, and it proved to work even under complex deformations.

A Cooperative Multi-Robot Control Using Ad Hoc Wireless Network

In this paper, a Cooperative Multi-robot for Carrying Targets (CMCT) algorithm is proposed. The multi-robot team consists of three robots: one supervisor and two workers that carry boxes in a store of 100×100 m². Each robot has a self-recharging mechanism. The CMCT minimizes the robots' working time for carrying many boxes during the day by working in parallel; that is, the supervisor detects the required variables while the other robots work with the previous variables. It works with straightforward mechanical models using simple cosine laws. It finds each robot's shortest obstacle-avoiding path to the target position using a proposed CMCT path planning (CMCT-PP) algorithm, and it prevents collisions between robots while they move. The robots interact over an ad hoc wireless network. Simulation results show that the proposed system, which consists of the CMCT algorithm and its accompanying CMCT-PP algorithm, achieves a large improvement in time and distance over existing algorithms while performing the required tasks.

Multi-View Neural Network Based Gait Recognition

Human identification at a distance has recently gained growing interest from computer vision researchers. Gait recognition aims to address this problem by identifying people based on the way they walk [1]. Gait recognition has three steps: preprocessing, feature extraction, and classification. This paper focuses on the classification step, which is essential for increasing the CCR (Correct Classification Rate). A Multilayer Perceptron (MLP) is used in this work. Neural networks imitate the human brain to perform intelligent tasks [3]. They can represent complicated relationships between input and output and acquire knowledge about these relationships directly from the data [2]. In this paper we apply an MLP neural network to 11 views in our database and compare the CCR values for these views. Experiments are performed with the NLPR databases, and the effectiveness of the proposed method for gait recognition is demonstrated.

Hand Gesture Recognition Based on Combined Features Extraction

Hand gesture recognition is an active area of research in the vision community, mainly for the purposes of sign language recognition and Human Computer Interaction. In this paper, we propose a system to recognize alphabet characters (A-Z) and numbers (0-9) in real time from stereo color image sequences using Hidden Markov Models (HMMs). Our system has three main stages: automatic segmentation and preprocessing of the hand regions, feature extraction, and classification. In the automatic segmentation and preprocessing stage, color and a 3D depth map are used to detect the hands, and the hand trajectory is then obtained in a further step using the Mean-shift algorithm and a Kalman filter. In the feature extraction stage, 3D combined features of location, orientation, and velocity with respect to Cartesian systems are used, and k-means clustering is then employed to produce the HMM codewords. In the final stage, classification, the Baum-Welch algorithm is used to fully train the HMM parameters. Alphabet and number gestures are recognized using the Left-Right Banded model in conjunction with the Viterbi algorithm. Experimental results demonstrate that our system can successfully recognize hand gestures with a 98.33% recognition rate.

Hand Gesture Recognition using Blob Detection for Immersive Projection Display System

We developed a vision interface framework for an immersive projection system, CAVE, in the virtual reality research field using hand gesture recognition with computer vision techniques. The background image was subtracted from the current webcam image frame and we convert the color space of the image into HSV space. Then we mask skin regions using a skin color range threshold and apply a noise reduction operation. We made blobs from the image and gestures were recognized using these blobs. Using our hand gesture recognition, we could implement an effective interface without bothering devices for CAVE.
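
The blob-forming step can be illustrated with a plain connected-component search over a binary skin mask. This is a sketch under assumptions, not the authors' code: the abstract does not say how blobs are built, so 4-connectivity, the min_size noise threshold, and the name find_blobs are all hypothetical, and the mask is assumed to come from an earlier HSV skin-color threshold.

```python
from collections import deque
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int]  # (row, column)


def find_blobs(mask: Sequence[Sequence[int]], min_size: int = 1) -> List[List[Pixel]]:
    """Group set pixels of a binary mask into 4-connected blobs.

    Blobs with fewer than min_size pixels are discarded as noise.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for row in range(height):
        for col in range(width):
            if not mask[row][col] or seen[row][col]:
                continue
            # Breadth-first flood fill starting from this unvisited skin pixel.
            queue = deque([(row, col)])
            seen[row][col] = True
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (
                        0 <= nr < height
                        and 0 <= nc < width
                        and mask[nr][nc]
                        and not seen[nr][nc]
                    ):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(pixels) >= min_size:
                blobs.append(pixels)
    return blobs
```

Simple gesture cues such as the number of blobs, their sizes, or their centroids could then be derived from the returned pixel lists.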

Using Mean-Shift Tracking Algorithms for Real-Time Tracking of Moving Images on an Autonomous Vehicle Testbed Platform

This paper describes new computer vision algorithms that have been developed to track moving objects as part of a long-term study into the design of (semi-)autonomous vehicles. We present the results of a study to exploit variable kernels for tracking in video sequences. The basis of our work is the mean shift object-tracking algorithm; for a moving target, it is usual to define a rectangular target window in an initial frame, and then process the data within that window to separate the tracked object from the background by the mean shift segmentation algorithm. Rather than use the standard Epanechnikov kernel, we have used a kernel weighted by the Chamfer distance transform to improve the accuracy of target representation and localization, minimising the distance between the two distributions in RGB color space using the Bhattacharyya coefficient. Experimental results show the improved tracking capability and versatility of the algorithm in comparison with results using the standard kernel. These algorithms are incorporated as part of a robot test-bed architecture which has been used to demonstrate their effectiveness.
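
As a reminder of the similarity measure named above, the Bhattacharyya coefficient between two normalized histograms p and q is the sum over bins of sqrt(p_i * q_i). The sketch below computes it for plain bin-count lists. It leaves out the kernel weighting (Epanechnikov or Chamfer-based) that the abstract discusses, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def normalize(hist: Sequence[float]) -> List[float]:
    """Scale a histogram of non-negative bin counts so that it sums to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one positive bin")
    return [value / total for value in hist]


def bhattacharyya_coefficient(p: Sequence[float], q: Sequence[float]) -> float:
    """Return the overlap of two histograms: 1 for identical, 0 for disjoint."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p_norm, q_norm = normalize(p), normalize(q)
    return sum(math.sqrt(a * b) for a, b in zip(p_norm, q_norm))


def bhattacharyya_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Distance form often used in mean shift tracking: sqrt(1 - coefficient)."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))
```

A tracker would compare the target model histogram with the histogram of each candidate window and move toward the window that maximizes the coefficient.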

Efficient Block Matching Algorithm for Motion Estimation

Motion estimation is a key problem in video processing and computer vision. Optical flow motion estimation can achieve high estimation accuracy when the motion vector is small. The three-step search algorithm can handle large motion vectors but is not very accurate. This paper proposes a joint algorithm that achieves high estimation accuracy regardless of whether the motion vector is small or large, while keeping the computation cost much lower than full search.
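
For context, the classic three-step search that the joint algorithm builds on can be sketched as below. This covers only the baseline, not the proposed joint method. The assumptions are labeled: frames are nested lists of gray values, the cost is the sum of absolute differences (SAD), the block size is 4, and the initial step of 4 gives a search range of ±7 pixels. The function names are hypothetical.

```python
from typing import Sequence, Tuple

Frame = Sequence[Sequence[int]]  # gray-level image indexed as frame[y][x]


def sad(ref: Frame, cur: Frame, bx: int, by: int, dx: int, dy: int, n: int) -> int:
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def three_step_search(
    ref: Frame, cur: Frame, bx: int, by: int, n: int = 4, step: int = 4
) -> Tuple[int, int]:
    """Estimate the motion vector (dx, dy) of one block by three-step search."""
    height, width = len(ref), len(ref[0])
    best_dx = best_dy = 0
    while step >= 1:
        center_dx, center_dy = best_dx, best_dy
        best_cost = None
        # Test the center and its eight neighbours at the current step size.
        for off_y in (-step, 0, step):
            for off_x in (-step, 0, step):
                dx, dy = center_dx + off_x, center_dy + off_y
                inside = (
                    0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height
                )
                if not inside:
                    continue
                cost = sad(ref, cur, bx, by, dx, dy, n)
                if best_cost is None or cost < best_cost:
                    best_cost, best_dx, best_dy = cost, dx, dy
        step //= 2  # halve the step: 4, 2, 1
    return best_dx, best_dy
```

Because each stage examines only nine candidates, the search evaluates at most 27 positions instead of the 225 a full ±7 search needs, but it can settle on a local minimum, which is the accuracy weakness the abstract refers to.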

Edge Detection in Digital Images Using Fuzzy Logic Technique

The fuzzy technique is an operator introduced to simulate, at a mathematical level, the compensatory behavior in decision making or subjective evaluation. This paper introduces such operators by way of a computer vision application. A novel method based on a fuzzy logic reasoning strategy is proposed for edge detection in digital images without determining a threshold value. The proposed approach begins by segmenting the images into regions using a floating 3x3 binary matrix. The edge pixels are mapped to a range of values distinct from each other. To assess its robustness, the results of the proposed method for different captured images are compared with those obtained with the linear Sobel operator. The method gave a permanent effect on the smoothness and straightness of straight lines and good roundness for curved lines. At the same time, the corners become sharper and can be defined easily.

A Hidden Markov Model-Based Isolated and Meaningful Hand Gesture Recognition

Gesture recognition is a challenging task that requires extracting meaningful gestures from continuous hand motion. In this paper, we propose an automatic system that recognizes, in real time, both isolated gestures and meaningful gestures from continuous hand motion for Arabic numbers from 0 to 9, based on Hidden Markov Models (HMMs). To handle isolated gestures, HMMs with Ergodic, Left-Right (LR), and Left-Right Banded (LRB) topologies are applied to the discrete feature vector extracted from stereo color image sequences. These topologies are evaluated with different numbers of states, ranging from 3 to 10. A new system is developed to recognize meaningful gestures based on zero-codeword detection with static velocity motion for continuous gestures. The LRB topology, in conjunction with the Baum-Welch (BW) algorithm for training and the forward algorithm with the Viterbi path for testing, gives the best performance. Experimental results show that the proposed system can successfully recognize isolated and meaningful gestures, achieving average recognition rates of 98.6% and 94.29%, respectively.
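
The LRB topology and Viterbi decoding mentioned above can be sketched as follows. This is an illustration only: in an LRB model each state may either stay where it is or advance to the next state, and the self-transition probability of 0.5 used here is an assumed placeholder for values that Baum-Welch training would normally supply. The function names and the discrete emission matrix layout B[state][symbol] are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def lrb_transitions(n_states: int, stay: float = 0.5) -> List[List[float]]:
    """Build a Left-Right Banded transition matrix: self-loop or next state only."""
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        matrix[i][i] = stay
        matrix[i][i + 1] = 1.0 - stay
    matrix[n_states - 1][n_states - 1] = 1.0  # final state absorbs
    return matrix


def _log(p: float) -> float:
    return math.log(p) if p > 0 else float("-inf")


def viterbi(
    obs: Sequence[int],
    pi: Sequence[float],
    A: Sequence[Sequence[float]],
    B: Sequence[Sequence[float]],
) -> Tuple[List[int], float]:
    """Return the most likely state path and its log probability."""
    n = len(pi)
    delta = [_log(pi[s]) + _log(B[s][obs[0]]) for s in range(n)]
    backpointers = []
    for symbol in obs[1:]:
        new_delta, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] + _log(A[i][j]))
            new_delta.append(delta[best_i] + _log(A[best_i][j]) + _log(B[j][symbol]))
            pointers.append(best_i)
        delta = new_delta
        backpointers.append(pointers)
    last = max(range(n), key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]
```

To classify an isolated gesture, one such model per digit could be scored against the observed codeword sequence and the digit with the highest log probability chosen.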

Tracking Objects in Color Image Sequences: Application to Football Images

In this paper, we present a comparative study of two computer vision systems for object recognition and tracking. The algorithms follow two different approaches based on regions, each constituted by a set of pixels, which parameterize objects in shot sequences. For image segmentation and object detection, the FCM technique is used; the overlap between cluster distributions is minimized by using a suitable color space (other than RGB). The first technique takes into account a priori probabilities governing the computation of the various clusters to track objects. A Parzen kernel method is described that allows the players to be identified in each frame; we also show the importance of searching for the standard deviation value of the Gaussian probability density function. Region matching is carried out by an algorithm that operates on the Mahalanobis distance between region descriptors in two subsequent frames and uses singular value decomposition to compute a set of correspondences satisfying both the principle of proximity and the principle of exclusion.

Partial 3D Reconstruction using Evolutionary Algorithms

When reconstructing a scenario, it is necessary to know the structure of the elements present in the scene in order to interpret it. In this work we link 3D scene reconstruction to evolutionary algorithms through stereo vision theory. We consider stereo vision as a method that reconstructs a scene using only a pair of images of the scene and some computation. From several images of a scene captured from different positions, stereo vision can give us an idea of the three-dimensional characteristics of the world. Stereo vision usually requires two cameras, by analogy with the mammalian vision system. In this work we employ only one camera, which is translated along a path and captures images at regular distances. As we cannot perform all the computations required for an exhaustive reconstruction, we employ an evolutionary algorithm to partially reconstruct the scene in real time. The algorithm employed is the fly algorithm, which employs "flies" to reconstruct the principal characteristics of the world following certain evolutionary rules.