Abstract: During the past several years, face recognition in video
has received significant attention. Not only the wide range of
commercial and law enforcement applications, but also the availability
of feasible technologies after several decades of research contributes
to the trend. Although current face recognition systems have reached a
certain level of maturity, their development is still limited by the
conditions brought about by many real applications. For example,
recognizing faces in video sequences acquired in an open
environment with changes in illumination and/or pose and/or facial
occlusion and/or low resolution of the acquired images remains a largely
unsolved problem. In other words, algorithms that cope with these
conditions are yet to be developed. This paper provides an up-to-date survey of video-based
face recognition research. To present a comprehensive survey, we
categorize existing video-based recognition approaches and present
detailed descriptions of representative methods within each category.
In addition, relevant topics such as real-time detection and real-time
tracking in video, and issues such as illumination, pose, 3D and low
resolution, are covered.
Abstract: People detection from images has a variety of applications, such as video surveillance and driver assistance systems, but it remains a challenging task and is more difficult in crowded environments such as shopping malls, where occlusion of the lower parts of the human body often occurs. The lack of full-body information requires more effective features than common features such as HOG. In this paper, new features are introduced that exploit the global self-symmetry (GSS) characteristic of head-shoulder patterns. The features encode the similarity or difference of color histograms and oriented gradient histograms between two vertically symmetric blocks. The domain-specific features are rapid to compute from integral images in the Viola-Jones cascade-of-rejecters framework. The proposed features are evaluated on our own head-shoulder dataset, which consists in part of the well-known INRIA pedestrian dataset. Experimental results show that the GSS features reduce false alarms marginally and that the gradient GSS features are selected more often than the color GSS ones during feature selection.
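As an illustration of the symmetry idea, the sketch below compares the intensity histogram of a block with that of its mirror image about the vertical center line of a detection window. It is a minimal sketch and not the authors' implementation: the function names, the use of grayscale intensity instead of color, the bin count, and histogram intersection as the similarity measure are all assumptions, and the integral-image speedup mentioned in the abstract is omitted.

```python
import numpy as np

def block_histogram(patch, bins=8):
    """Normalized intensity histogram of a 2D patch with values in [0, 255]."""
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    total = hist.sum()
    return hist / total if total > 0 else hist.astype(float)

def gss_similarity(window, top, left, height, width, bins=8):
    """Histogram intersection between a block and its mirror block.

    The mirror block is the reflection of the block about the vertical
    center line of the window (an assumed reading of "vertically
    symmetric blocks"). Returns a value in [0, 1]; 1 means the two
    histograms are identical.
    """
    window = np.asarray(window)
    block = window[top:top + height, left:left + width]
    mirror_left = window.shape[1] - left - width  # first column of the mirror block
    mirror = window[top:top + height, mirror_left:mirror_left + width]
    h1 = block_histogram(block, bins)
    h2 = block_histogram(mirror, bins)
    return float(np.minimum(h1, h2).sum())
```

A left-right symmetric window yields a similarity of 1 for every block, while a block whose mirror has a disjoint intensity range yields 0; a difference feature could be taken as one minus this value.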
Abstract: A novel feature selection strategy is proposed in this paper to improve recognition accuracy on faces affected by nonuniform illumination, partial occlusions and varying expressions. This technique is especially applicable in scenarios where the possibility of obtaining a reliable intra-class probability distribution is minimal because few training samples are available. Phase congruency features in an image are defined as the points where the Fourier components of that image are maximally in phase. These features are invariant to the brightness and contrast of the image under consideration. This property makes it possible to achieve lighting-invariant face recognition. Phase congruency maps of the training samples are generated and a novel modular feature selection strategy is implemented. Smaller sub-regions from a predefined neighborhood within the phase congruency images of the training samples are merged to obtain a large set of features. These features are arranged in order of increasing distance between the sub-regions involved in merging. The assumption behind the proposed region merging and arrangement strategy is that local dependencies among the pixels are more important than global dependencies. The obtained feature sets are then arranged in decreasing order of discriminating capability using a criterion function, which is the ratio of the between-class variance to the within-class variance of the sample set in the PCA domain. The results indicate a large improvement in classification performance compared to baseline algorithms.
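The ranking criterion described above can be sketched as a Fisher-style ratio. This is a hedged illustration only: the function names are hypothetical, the ratio is computed in its scalar (trace) form, and the PCA projection that the abstract applies before evaluating the criterion is assumed to have been done already by the caller.

```python
import numpy as np

def fisher_ratio(features, labels):
    """Between-class variance divided by within-class variance.

    features: array of shape (n_samples, n_dims), assumed already
    projected into the PCA domain; labels: class label per sample.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    between = 0.0
    within = 0.0
    for c in np.unique(labels):
        x = features[labels == c]
        class_mean = x.mean(axis=0)
        # Scatter of the class mean around the overall mean, weighted by class size.
        between += len(x) * np.sum((class_mean - overall_mean) ** 2)
        # Scatter of the samples around their own class mean.
        within += np.sum((x - class_mean) ** 2)
    return between / within if within > 0 else float("inf")

def rank_feature_sets(feature_sets, labels):
    """Indices of the feature sets, most discriminative first."""
    scores = [fisher_ratio(f, labels) for f in feature_sets]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

A feature set whose classes form tight, well-separated clusters gets a high ratio and is placed early in the ordering.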
Abstract: In this paper, a method to detect multiple ellipses is presented. The technique is efficient and robust against incomplete ellipses due to partial occlusion, noise or missing edges, and against outliers. It is an iterative technique that finds and removes the best ellipse until no reasonable ellipse is found. At each run, the best ellipse is extracted from randomly selected edge patches, and its fitness is calculated and compared to a fitness threshold. The RANSAC algorithm is applied as the sampling process, together with Direct Least Square (DLS) fitting of ellipses as the fitting algorithm. In our experiments, the method performs very well and is robust against noise and spurious edges on both synthetic and real-world image data.
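The find-and-remove loop described above might look roughly like the following. This is a sketch under several stated substitutions, not the paper's method: it samples five individual edge points rather than edge patches, fits a general conic by SVD instead of the DLS ellipse fit, and uses the inlier count under a first-order (Sampson) distance as the fitness. All function names, the tolerance, and the thresholds are hypothetical.

```python
import numpy as np

def fit_conic(points):
    """Least-squares conic A x^2 + B xy + C y^2 + D x + E y + F = 0 (unit norm)."""
    x, y = points[:, 0], points[:, 1]
    design = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    _, _, vt = np.linalg.svd(design)
    return vt[-1]  # right singular vector of the smallest singular value

def is_ellipse(conic):
    a, b, c = conic[:3]
    return b * b - 4.0 * a * c < 0.0

def sampson_distance(conic, points):
    """First-order approximation of the point-to-conic distance."""
    a, b, c, d, e, f = conic
    x, y = points[:, 0], points[:, 1]
    value = a * x * x + b * x * y + c * y * y + d * x + e * y + f
    gx = 2 * a * x + b * y + d
    gy = b * x + 2 * c * y + e
    return np.abs(value) / np.maximum(np.hypot(gx, gy), 1e-12)

def ransac_ellipse(points, iterations=500, tol=0.05, rng=None):
    """Best ellipse (by inlier count) among conics through random 5-point samples."""
    rng = np.random.default_rng() if rng is None else rng
    best_conic, best_mask = None, np.zeros(len(points), dtype=bool)
    for _ in range(iterations):
        sample = points[rng.choice(len(points), size=5, replace=False)]
        conic = fit_conic(sample)
        if not is_ellipse(conic):
            continue
        mask = sampson_distance(conic, points) < tol
        if mask.sum() > best_mask.sum():
            best_conic, best_mask = conic, mask
    return best_conic, best_mask

def detect_ellipses(points, min_inliers=30, **kwargs):
    """Repeatedly extract the best ellipse and remove its inliers."""
    found = []
    remaining = np.asarray(points, dtype=float)
    while len(remaining) >= 5:
        conic, mask = ransac_ellipse(remaining, **kwargs)
        if conic is None or mask.sum() < min_inliers:
            break  # no reasonable ellipse left
        found.append(fit_conic(remaining[mask]))  # refit on all inliers
        remaining = remaining[~mask]
    return found
```

Removing the inliers after each detection is what lets the same sampler find the next ellipse; the loop stops once the best candidate falls below the fitness threshold.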
Abstract: The goal of the study reported in this paper was to determine whether Ambient Occlusion Shading (AOS) has a significant effect on users' perception of American Sign Language (ASL) fingerspelling animations. Seventy-one (71) subjects participated in the study; all subjects were fluent in ASL. The participants were asked to watch forty (40) sign language animation clips representing twenty (20) fingerspelled words. Twenty (20) clips did not show ambient occlusion, whereas the other twenty (20) were rendered using ambient occlusion shading. After viewing each animation, subjects were asked to type the word being fingerspelled and rate its legibility. Findings show that the presence of AOS had a significant effect on the subjects' perception of the signed words. Subjects were able to recognize the animated words rendered with AOS with a higher level of accuracy, and the legibility ratings of the animations showing AOS were consistently higher across subjects.
Abstract: This paper presents a sensing system for 3D sensing
and mapping by a tracked mobile robot with an arm-type sensor
movable unit and a laser range finder (LRF). The arm-type sensor
movable unit is mounted on the robot and the LRF is installed at the
end of the unit. This mechanism enables the sensor to change its position
and orientation so that it avoids occlusions caused by the terrain. The
sensing system is also able to change the height of the LRF while
keeping its orientation flat for efficient sensing. In this kind
of mapping, it may be difficult for a moving robot to apply mapping
algorithms such as the iterative closest point (ICP) because the sets of
2D data at each sensor height may be far apart on a common surface. For
this kind of mapping, the authors therefore applied
interpolation to generate plausible model data for ICP. The results of
several experiments confirmed the validity of this kind of sensing and
mapping with this sensing system.
Abstract: This paper proposes a novel, robust, and simple method for obtaining a human 3D face model and camera pose (position and orientation) from a video sequence. Given a video sequence of a face recorded with an off-the-shelf digital camera, feature points used to define facial parts are tracked using the Active Appearance Model (AAM). Then, the face's 3D structure and the camera pose of each video frame can be calculated simultaneously from the obtained point correspondences. The proposed method is primarily based on the combined approaches of Gradient Descent and Powell's Multidimensional Minimization. With this method, temporarily occluded points, including cases of self-occlusion, do not pose a problem. As long as the point correspondences displayed in the video sequence have enough parallax, these missing points can still be reconstructed.
Abstract: In this work a dual laser triangulation system is presented for fast building of 2.5D textured models of objects within a production line. This scanner is designed to produce data suitable for 3D completeness inspection algorithms. For this purpose two laser projectors have been used in order to considerably reduce the problem of occlusions in the camera movement direction. Results of reconstruction of electronic boards are presented, together with a comparison with a commercial system.
Abstract: Real-time hand tracking is a challenging task in many
computer vision applications such as gesture recognition. This paper
proposes a robust method for hand tracking in a complex environment
using Mean-shift analysis and a Kalman filter in conjunction with a 3D
depth map. The depth information, which is obtained by passive stereo
measurement based on cross-correlation and the known calibration data of
the cameras, solves the overlapping problem between hands and face.
Mean-shift analysis uses the gradient of the Bhattacharyya
coefficient as a similarity function to derive the hand candidate
that is most similar to a given hand target model. A Kalman
filter is then used to estimate the position of the hand target. The results
of hand tracking, tested on various video sequences, are robust to
changes in shape as well as partial occlusion.
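Two building blocks named in this abstract can be sketched briefly: the Bhattacharyya coefficient between two normalized histograms, and a Kalman filter that smooths the tracked hand position. This is a minimal sketch under assumptions, not the authors' system: the constant-velocity motion model, the noise variances, and all class and function names are hypothetical, and the mean-shift iteration and the stereo depth map are omitted.

```python
import numpy as np

def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two normalized histograms (1 = identical)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(np.sqrt(p * q)))

class ConstantVelocityKalman:
    """2D position tracker with state [x, y, vx, vy] (assumed motion model)."""

    def __init__(self, x0, y0, dt=1.0, process_var=1e-2, meas_var=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])
        self.P = np.eye(4) * 10.0            # initial state uncertainty
        self.F = np.eye(4)                   # state transition
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4))            # only the position is measured
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = np.eye(4) * process_var     # process noise covariance
        self.R = np.eye(2) * meas_var        # measurement noise covariance

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        innovation = np.asarray(z, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)   # Kalman gain
        self.x = self.x + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```

In a tracker of this kind, the position of the best mean-shift candidate would be passed to `update` in each frame, and `predict` would supply the search location for the next frame.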
Abstract: In this paper, we propose a robust disease detection
method, called adaptive orientation code matching (Adaptive OCM),
which is developed from a robust image registration algorithm:
orientation code matching (OCM), to achieve continuous and
site-specific detection of changes in plant disease. We use a two-stage
framework to realize our research purpose. In the first stage,
adaptive OCM is employed, which not only realizes
continuous and site-specific observation of disease development but
also shows excellent robustness when searching for non-rigid plant objects
under changes in scene illumination, translation, small rotation and
occlusion. In the second stage, a support
vector machine (SVM) based on a two-dimensional (2D)
xy-color histogram feature is used for pixel-wise disease
classification and quantification. The indoor experiment results
demonstrate the feasibility and potential of our proposed algorithm,
which could be implemented in real field situations for better
observation of plant disease development.
Abstract: Region covariance (RC) descriptor is an effective
and efficient feature for visual tracking. Current RC-based tracking
algorithms use the whole RC matrix to track the target in video
directly. However, these whole-RC-based algorithms have some
issues. If some features are contaminated, the whole RC
becomes unreliable, which causes the tracker to lose the object. In
addition, even if some features discriminate the target from the
background very well, the other features are still processed, which
reduces efficiency. In this paper a new robust tracking method is proposed,
in which the whole RC matrix is decomposed into several low-rank
matrices. Those matrices are dynamically chosen and processed so
as to achieve a good tradeoff between discriminability and
complexity. Experimental results show that, compared to other
RC-based methods, our method is more robust to complex environment
changes, especially when occlusion happens or when the background
is similar to the target.
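For readers unfamiliar with the descriptor, a region covariance matrix is the covariance of per-pixel feature vectors inside a region. The sketch below is illustrative only: the feature set (pixel coordinates, intensity, and absolute gradients) and the function name are assumptions, and the low-rank decomposition proposed in the abstract is not attempted.

```python
import numpy as np

def region_covariance(gray, top, left, height, width):
    """Covariance of per-pixel features [x, y, I, |Ix|, |Iy|] in a region.

    gray is a 2D grayscale image; the region must lie inside the image
    and be at least 2 pixels wide and high.
    """
    patch = np.asarray(gray, dtype=float)[top:top + height, left:left + width]
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    gy, gx = np.gradient(patch)  # derivatives along rows (y) and columns (x)
    feats = np.stack([xs, ys, patch, np.abs(gx), np.abs(gy)], axis=-1)
    # One row per pixel, one column per feature.
    return np.cov(feats.reshape(-1, 5), rowvar=False)
```

The result is a 5x5 symmetric positive semi-definite matrix regardless of the region size, which is what makes the descriptor compact. Because every entry mixes all pixels, a single contaminated feature channel affects a whole row and column of the matrix.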
Abstract: This paper presents a novel approach for representing
the spatio-temporal topology of the camera network with overlapping
and non-overlapping fields of view (FOVs). The topology is
determined by tracking moving objects and establishing object
correspondence across multiple cameras. To track people successfully
in multiple camera views, we used the Merge-Split (MS) approach to handle
object occlusion in a single camera and a grid-based approach to
extract accurate object features. In addition, we considered the
appearance of people and the transition time between entry and exit
zones for tracking objects across blind regions of multiple cameras
with non-overlapping FOVs. The main contribution of this paper is to
estimate transition times between various entry and exit zones, and to
graphically represent the camera topology as an undirected weighted
graph using the transition probabilities.
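One way to picture the resulting graph is sketched below. It is an assumption-laden illustration, not the paper's estimator: the input format, the function name, and the choice to define an edge's probability as its share of all observed transitions are hypothetical.

```python
from collections import defaultdict

def build_topology(transitions):
    """Build an undirected weighted graph from observed zone transitions.

    transitions: iterable of (zone_a, zone_b, seconds) tuples, one per
    tracked object that left one zone and reappeared in another.
    Returns {frozenset({zone_a, zone_b}): {"probability", "mean_time"}}.
    """
    counts = defaultdict(int)
    times = defaultdict(float)
    for a, b, seconds in transitions:
        edge = frozenset((a, b))  # unordered pair, so A->B and B->A share an edge
        counts[edge] += 1
        times[edge] += seconds
    total = sum(counts.values())
    return {
        edge: {
            "probability": counts[edge] / total,
            "mean_time": times[edge] / counts[edge],
        }
        for edge in counts
    }
```

For example, two observed transitions between zones A and B (4 s and 6 s) and one between B and C (10 s) give the A-B edge a probability of 2/3 with a mean transition time of 5 s.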
Abstract: This paper presents a novel template-based method to
detect objects of interest from real images by shape matching. To
locate a target object that has a similar shape to a given template
boundary, the proposed method integrates three components: contour
grouping, partial shape matching, and boundary verification. In the
first component, low-level image features, including edges and
corners, are grouped into a set of perceptually salient closed contours
using an extended ratio-contour algorithm. In the second component,
we develop a partial shape matching algorithm to identify the
fractions of detected contours that partly match given template
boundaries. Specifically, we represent template boundaries and
detected contours using landmarks, and apply a greedy algorithm to
search the matched landmark subsequences. For each matched
fraction between a template and a detected contour, we estimate an
affine transform that transforms the whole template into a hypothetical
boundary. In the third component, we provide an efficient algorithm
based on oriented edge lists to determine the target boundary from
the hypothetical boundaries by checking each of them against image
edges. We evaluate the proposed method on recognizing and
localizing 12 template leaves in a data set of real images with cluttered
backgrounds, illumination variations, occlusions, and image noise.
The experiments demonstrate the high performance of our proposed
method.
Abstract: In recent years, interest in efficient tracking systems for surveillance applications has increased. Many of the proposed techniques are designed for static-camera environments. When the camera is moving, tracking moving objects becomes more difficult and many techniques fail to detect and track the desired targets. The problem becomes more complex when we want to track a specific object in real time using a moving Pan and Tilt camera system to keep the target within the image. This type of tracking is of high importance in surveillance applications. When a target is detected in a certain zone, the ability to track it automatically and continuously, keeping it within the image until action is taken, is very important for security personnel working at very sensitive sites. This work presents a real-time tracking system that permits the detection and continuous tracking of targets using a Pan and Tilt camera platform. A novel and efficient approach for dealing with occlusions is presented. A new intelligent forget factor is also introduced in order to take into account target shape variations and avoid learning undesired objects. Tests conducted in outdoor operational scenarios show the efficiency and robustness of the proposed approach.
Abstract: Real-time 3D applications have to guarantee
interactive rendering speed. The number of polygons that can be
rendered is restricted by the performance of the graphics hardware
or graphics algorithms. Generally, rendering performance increases
drastically when only the dynamic 3D models are handled,
since they are far fewer than the static ones. Since the shapes and colors of
static objects don't change when the viewing direction is fixed, this
information can be reused. We render, in real time, huge numbers of polygons
that cannot be handled by conventional rendering techniques by
using a static object image and merging it with the rendering result of the
dynamic objects. Performance decreases as a
consequence of updating the static object image, which includes removing
a static object that starts to move and re-rendering the other static objects
overlapped by the moving one. Based on the visibility of the object
beginning to move, we can skip the updating process. As a result, we
enhance rendering performance and reduce the differences in rendering
speed between frames. The proposed method renders a total of
200,000,000 polygons, consisting of 500,000 dynamic polygons with
the rest static, at about 100 frames per second.
Abstract: CFD simulations are carried out in arterial stenoses
with 48% areal occlusion. A non-Newtonian fluid model is selected for
the blood flow, as the same problem has been solved before with a
Newtonian fluid model. Studies on flow resistance in the presence
of surface irregularities are carried out. Investigations are also
performed on the pressure drop at various Reynolds numbers. The
present study revealed that the pressure drop across a stenosed artery
is practically unaffected by surface irregularities at low Reynolds
numbers, while flow features are observed and discussed at higher
Reynolds numbers.
Abstract: Cameras are often mounted on platforms that can move, like rovers, booms, gantries and aircraft. People operate such platforms to capture desired views of a scene or target. To avoid collisions with the environment and occlusions, such platforms often possess redundant degrees-of-freedom. As a result, manipulating such platforms demands much skill. Visual-servoing some degrees-of-freedom may reduce operator burden and improve tracking performance. This concept, which we call human-in-the-loop visual-servoing, is demonstrated in this paper and applies an α-β-γ filter and feedforward controller to a broadcast camera boom.
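For reference, an α-β-γ filter tracks position, velocity and acceleration from position measurements using three fixed gains. The sketch below uses one common textbook form of the update equations; the class name, the gain values in the usage note, and the signal being filtered are assumptions, since the abstract does not say how the filter is configured or applied.

```python
class AlphaBetaGammaFilter:
    """Fixed-gain tracker for position x, velocity v and acceleration a."""

    def __init__(self, alpha, beta, gamma, dt, x=0.0, v=0.0, a=0.0):
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
        self.dt = dt
        self.x, self.v, self.a = x, v, a

    def update(self, z):
        """Fold in one position measurement z and return the new position estimate."""
        dt = self.dt
        # Predict one step ahead with a constant-acceleration model.
        x_pred = self.x + self.v * dt + 0.5 * self.a * dt * dt
        v_pred = self.v + self.a * dt
        # Correct each state with a fixed fraction of the residual.
        residual = z - x_pred
        self.x = x_pred + self.alpha * residual
        self.v = v_pred + (self.beta / dt) * residual
        self.a = self.a + (2.0 * self.gamma / (dt * dt)) * residual
        return self.x
```

With illustrative gains such as `AlphaBetaGammaFilter(0.5, 0.4, 0.05, dt=1.0)`, the estimates converge to a noise-free constant-acceleration trajectory; larger gains respond faster to target motion but pass more measurement noise through.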
Abstract: This paper presents recent work on improving
the robot-vision-based control strategy for an underwater pipeline
tracking system. The study focuses on developing image processing
algorithms and a fuzzy inference system for the analysis of the
terrain. The main goal is to implement a supervisory fuzzy learning
control technique to reduce errors in navigation decisions caused by
the pipeline occlusion problem. The system developed is capable of
interpreting underwater images containing occluded pipelines, seabed
and other unwanted noise. The algorithm proposed in previous work
does not explore the cooperation between fuzzy controllers,
knowledge and learnt data to improve the outputs for underwater
pipeline tracking. Computer simulations and prototype simulations
demonstrate the effectiveness of this approach. The accuracy
of the system is also discussed.
Abstract: One of the major, difficult tasks in automated video
surveillance is the segmentation of relevant objects in the scene.
Current implementations often yield inconsistent results on average
from frame to frame when trying to differentiate partly occluding
objects. This paper presents an efficient block-based segmentation
algorithm which is capable of separating partly occluding objects and
detecting shadows. It has been shown to perform in real time, with a
maximum duration of 47.48 ms per frame (for 8x8 blocks on a
720x576 image) and a true positive rate of 89.2%. The flexible
structure of the algorithm enables adaptations and improvements with
little effort. Most of the parameters correspond to relative differences
between quantities extracted from the image and should therefore not
depend on scene and lighting conditions. The result is a
performance-oriented segmentation algorithm that is applicable in
all critical real-time scenarios.
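To put the reported figures in perspective, the short calculation below derives the block grid and the worst-case frame rate they imply. The helper names are hypothetical, and the calculation assumes the 8x8 blocks tile the 720x576 image without overlap.

```python
def block_grid(width, height, block):
    """Block columns, block rows and total blocks for a non-overlapping tiling."""
    cols, rows = width // block, height // block
    return cols, rows, cols * rows

def worst_case_fps(max_ms_per_frame):
    """Frame rate sustained if every frame takes the maximum duration."""
    return 1000.0 / max_ms_per_frame

cols, rows, total = block_grid(720, 576, 8)
print(cols, rows, total)                # 90 72 6480
print(round(worst_case_fps(47.48), 2))  # 21.06
```

In other words, the algorithm classifies 6480 blocks per frame and still sustains about 21 frames per second in the slowest case reported.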
Abstract: The Continuously Adaptive Mean-Shift (CamShift)
algorithm, incorporating scene depth information, is combined with
the l1-minimization sparse-representation-based method to form a
hybrid kernel and state space-based tracking algorithm. We take
advantage of the increased efficiency of the former and the
robustness to occlusion of the latter. A simple interchange
scheme transfers control between the algorithms based upon drift and
occlusion likelihood. This likelihood is quantified by the projection of
target candidates onto a depth map of the 2D scene obtained with a low-cost
stereo vision webcam. The result is improved tracking, in terms of drift,
over each algorithm individually in a challenging practical outdoor
multiple-occlusion test case.