Fast and Robust Long-term Tracking with Effective Searching Model

Kernelized Correlation Filter (KCF) based trackers have gained a lot of attention recently because of their accuracy and fast calculation speed. However, this algorithm is not robust in cases where the object is lost by a sudden change of direction, being obscured or going out of view. In order to improve KCF performance in long-term tracking, this paper proposes an anomaly detection method for target loss warning by analyzing the response map of each frame, and a classification algorithm for reliable target re-locating mechanism by using Random fern. Being tested with Visual Tracker Benchmark and Visual Object Tracking datasets, the experimental results indicated that the precision and success rate of the proposed algorithm were 2.92 and 2.61 times higher than that of the original KCF algorithm, respectively. Moreover, the proposed tracker handles occlusion better than many state-of-the-art long-term tracking methods while running at 60 frames per second.

Online Pose Estimation and Tracking Approach with Siamese Region Proposal Network

Human pose estimation and tracking are to accurately identify and locate the positions of human joints in the video. It is a computer vision task which is of great significance for human motion recognition, behavior understanding and scene analysis. There has been remarkable progress on human pose estimation in recent years. However, more researches are needed for human pose tracking especially for online tracking. In this paper, a framework, called PoseSRPN, is proposed for online single-person pose estimation and tracking. We use Siamese network attaching a pose estimation branch to incorporate Single-person Pose Tracking (SPT) and Visual Object Tracking (VOT) into one framework. The pose estimation branch has a simple network structure that replaces the complex upsampling and convolution network structure with deconvolution. By augmenting the loss of fully convolutional Siamese network with the pose estimation task, pose estimation and tracking can be trained in one stage. Once trained, PoseSRPN only relies on a single bounding box initialization and producing human joints location. The experimental results show that while maintaining the good accuracy of pose estimation on COCO and PoseTrack datasets, the proposed method achieves a speed of 59 frame/s, which is superior to other pose tracking frameworks.

Vision Based People Tracking System

In this paper we present the design and the implementation of a target tracking system where the target is set to be a moving person in a video sequence. The system can be applied easily as a vision system for mobile robot. The system is composed of two major parts the first is the detection of the person in the video frame using the SVM learning machine based on the “HOG” descriptors. The second part is the tracking of a moving person it’s done by using a combination of the Kalman filter and a modified version of the Camshift tracking algorithm by adding the target motion feature to the color feature, the experimental results had shown that the new algorithm had overcame the traditional Camshift algorithm in robustness and in case of occlusion.

Vision-Based Collision Avoidance for Unmanned Aerial Vehicles by Recurrent Neural Networks

Due to the sensor technology, video surveillance has become the main way for security control in every big city in the world. Surveillance is usually used by governments for intelligence gathering, the prevention of crime, the protection of a process, person, group or object, or the investigation of crime. Many surveillance systems based on computer vision technology have been developed in recent years. Moving target tracking is the most common task for Unmanned Aerial Vehicle (UAV) to find and track objects of interest in mobile aerial surveillance for civilian applications. The paper is focused on vision-based collision avoidance for UAVs by recurrent neural networks. First, images from cameras on UAV were fused based on deep convolutional neural network. Then, a recurrent neural network was constructed to obtain high-level image features for object tracking and extracting low-level image features for noise reducing. The system distributed the calculation of the whole system to local and cloud platform to efficiently perform object detection, tracking and collision avoidance based on multiple UAVs. The experiments on several challenging datasets showed that the proposed algorithm outperforms the state-of-the-art methods.

Real Time Object Tracking in H.264/ AVC Using Polar Vector Median and Block Coding Modes

This paper presents a real time video surveillance system which is capable of tracking multiple real time objects using Polar Vector Median (PVM) and Block Coding Modes (BCM) with Global Motion Compensation (GMC). This strategy works in the packed area and furthermore utilizes the movement vectors and BCM from the compressed bit stream to perform real time object tracking. We propose to do this in view of the neighboring Motion Vectors (MVs) using a method called PVM. Since GM adds to the object’s native motion, for accurate tracking, it is important to remove GM from the MV field prior to further processing. The proposed method is tested on a number of standard sequences and the results show its advantages over some of the current modern methods.

Smart Side View Mirror Camera for Real Time System

In the last decade, automotive companies have invested a lot in terms of innovation about many aspects regarding the automatic driver assistance systems. One innovation regards the usage of a smart camera placed on the car’s side mirror for monitoring the back and lateral road situation. A common road scenario is the overtaking of the preceding car and, in this case, a brief distraction or a loss of concentration can lead the driver to undertake this action, even if there is an already overtaking vehicle, leading to serious accidents. A valid support for a secure drive can be a smart camera system, which is able to automatically analyze the road scenario and consequentially to warn the driver when another vehicle is overtaking. This paper describes a method for monitoring the side view of a vehicle by using camera optical flow motion vectors. The proposed solution detects the presence of incoming vehicles, assesses their distance from the host car, and warns the driver through different levels of alert according to the estimated distance. Due to the low complexity and computational cost, the proposed system ensures real time performances.

Object Tracking in Motion Blurred Images with Adaptive Mean Shift and Wavelet Feature

A method for object tracking in motion blurred images is proposed in this article. This paper shows that object tracking could be improved with this approach. We use mean shift algorithm to track different objects as a main tracker. But, the problem is that mean shift could not track the selected object accurately in blurred scenes. So, for better tracking result, and increasing the accuracy of tracking, wavelet transform is used. We use a feature named as blur extent, which could help us to get better results in tracking. For calculating of this feature, we should use Harr wavelet. We can look at this matter from two different angles which lead to determine whether an image is blurred or not and to what extent an image is blur. In fact, this feature left an impact on the covariance matrix of mean shift algorithm and cause to better performance of tracking. This method has been concentrated mostly on motion blur parameter. transform. The results reveal the ability of our method in order to reach more accurately tracking.

Automatic Motion Trajectory Analysis for Dual Human Interaction Using Video Sequences

Advance in techniques of image and video processing has enabled the development of intelligent video surveillance systems. This study was aimed to automatically detect moving human objects and to analyze events of dual human interaction in a surveillance scene. Our system was developed in four major steps: image preprocessing, human object detection, human object tracking, and motion trajectory analysis. The adaptive background subtraction and image processing techniques were used to detect and track moving human objects. To solve the occlusion problem during the interaction, the Kalman filter was used to retain a complete trajectory for each human object. Finally, the motion trajectory analysis was developed to distinguish between the interaction and non-interaction events based on derivatives of trajectories related to the speed of the moving objects. Using a database of 60 video sequences, our system could achieve the classification accuracy of 80% in interaction events and 95% in non-interaction events, respectively. In summary, we have explored the idea to investigate a system for the automatic classification of events for interaction and non-interaction events using surveillance cameras. Ultimately, this system could be incorporated in an intelligent surveillance system for the detection and/or classification of abnormal or criminal events (e.g., theft, snatch, fighting, etc.). 

Toward Indoor and Outdoor Surveillance Using an Improved Fast Background Subtraction Algorithm

The detection of moving objects from a video image sequences is very important for object tracking, activity recognition, and behavior understanding in video surveillance. The most used approach for moving objects detection / tracking is background subtraction algorithms. Many approaches have been suggested for background subtraction. But, these are illumination change sensitive and the solutions proposed to bypass this problem are time consuming. In this paper, we propose a robust yet computationally efficient background subtraction approach and, mainly, focus on the ability to detect moving objects on dynamic scenes, for possible applications in complex and restricted access areas monitoring, where moving and motionless persons must be reliably detected. It consists of three main phases, establishing illumination changes invariance, background/foreground modeling and morphological analysis for noise removing. We handle illumination changes using Contrast Limited Histogram Equalization (CLAHE), which limits the intensity of each pixel to user determined maximum. Thus, it mitigates the degradation due to scene illumination changes and improves the visibility of the video signal. Initially, the background and foreground images are extracted from the video sequence. Then, the background and foreground images are separately enhanced by applying CLAHE. In order to form multi-modal backgrounds we model each channel of a pixel as a mixture of K Gaussians (K=5) using Gaussian Mixture Model (GMM). Finally, we post process the resulting binary foreground mask using morphological erosion and dilation transformations to remove possible noise. For experimental test, we used a standard dataset to challenge the efficiency and accuracy of the proposed method on a diverse set of dynamic scenes.

Stereo Motion Tracking

Motion Tracking and Stereo Vision are complicated, albeit well-understood problems in computer vision. Existing softwares that combine the two approaches to perform stereo motion tracking typically employ complicated and computationally expensive procedures. The purpose of this study is to create a simple and effective solution capable of combining the two approaches. The study aims to explore a strategy to combine the two techniques of two-dimensional motion tracking using Kalman Filter; and depth detection of object using Stereo Vision. In conventional approaches objects in the scene of interest are observed using a single camera. However for Stereo Motion Tracking; the scene of interest is observed using video feeds from two calibrated cameras. Using two simultaneous measurements from the two cameras a calculation for the depth of the object from the plane containing the cameras is made. The approach attempts to capture the entire three-dimensional spatial information of each object at the scene and represent it through a software estimator object. In discrete intervals, the estimator tracks object motion in the plane parallel to plane containing cameras and updates the perpendicular distance value of the object from the plane containing the cameras as depth. The ability to efficiently track the motion of objects in three-dimensional space using a simplified approach could prove to be an indispensable tool in a variety of surveillance scenarios. The approach may find application from high security surveillance scenes such as premises of bank vaults, prisons or other detention facilities; to low cost applications in supermarkets and car parking lots.

Visual Object Tracking in 3D with Color Based Particle Filter

This paper addresses the problem of determining the current 3D location of a moving object and robustly tracking it from a sequence of camera images. The approach presented here uses a particle filter and does not perform any explicit triangulation. Only the color of the object to be tracked is required, but not any precisemotion model. The observation model we have developed avoids the color filtering of the entire image. That and the Monte Carlotechniques inside the particle filter provide real time performance.Experiments with two real cameras are presented and lessons learned are commented. The approach scales easily to more than two cameras and new sensor cues.

Object Tracking System Using Camshift, Meanshift and Kalman Filter

This paper presents a implementation of an object tracking system in a video sequence. This object tracking is an important task in many vision applications. The main steps in video analysis are two: detection of interesting moving objects and tracking of such objects from frame to frame. In a similar vein, most tracking algorithms use pre-specified methods for preprocessing. In our work, we have implemented several object tracking algorithms (Meanshift, Camshift, Kalman filter) with different preprocessing methods. Then, we have evaluated the performance of these algorithms for different video sequences. The obtained results have shown good performances according to the degree of applicability and evaluation criteria.

Visual Object Tracking and Interception in Industrial Settings

This paper presents a solution for a robotic manipulation problem. We formulate the problem as combining target identification, tracking and interception. The task in our solution is sensing a target on a conveyor belt and then intercepting robot-s end-effector at a convenient rendezvous point. We used an object recognition method which identifies the target and finds its position from visualized scene picture, then the robot system generates a solution for rendezvous problem using the target-s initial position and belt velocity . The interception of the target and the end-effector is executed at a convenient rendezvous point along the target-s calculated trajectory. Experimental results are obtained using a real platform with an industrial robot and a vision system over it.

Detection of Moving Images Using Neural Network

Motion detection is a basic operation in the selection of significant segments of the video signals. For an effective Human Computer Intelligent Interaction, the computer needs to recognize the motion and track the moving object. Here an efficient neural network system is proposed for motion detection from the static background. This method mainly consists of four parts like Frame Separation, Rough Motion Detection, Network Formation and Training, Object Tracking. This paper can be used to verify real time detections in such a way that it can be used in defense applications, bio-medical applications and robotics. This can also be used for obtaining detection information related to the size, location and direction of motion of moving objects for assessment purposes. The time taken for video tracking by this Neural Network is only few seconds.

Practical Issues for Real-Time Video Tracking

In this paper we present the algorithm which allows us to have an object tracking close to real time in Full HD videos. The frame rate (FR) of a video stream is considered to be between 5 and 30 frames per second. The real time track building will be achieved if the algorithm can follow 5 or more frames per second. The principle idea is to use fast algorithms when doing preprocessing to obtain the key points and track them after. The procedure of matching points during assignment is hardly dependent on the number of points. Because of this we have to limit pointed number of points using the most informative of them.

Multiple Object Tracking using Particle Swarm Optimization

This paper presents a particle swarm optimization (PSO) based approach for multiple object tracking based on histogram matching. To start with, gray-level histograms are calculated to establish a feature model for each of the target object. The difference between the gray-level histogram corresponding to each particle in the search space and the target object is used as the fitness value. Multiple swarms are created depending on the number of the target objects under tracking. Because of the efficiency and simplicity of the PSO algorithm for global optimization, target objects can be tracked as iterations continue. Experimental results confirm that the proposed PSO algorithm can rapidly converge, allowing real-time tracking of each target object. When the objects being tracked move outside the tracking range, global search capability of the PSO resumes to re-trace the target objects.

A Robust Method for Hand Tracking Using Mean-shift Algorithm and Kalman Filter in Stereo Color Image Sequences

Real-time hand tracking is a challenging task in many computer vision applications such as gesture recognition. This paper proposes a robust method for hand tracking in a complex environment using Mean-shift analysis and Kalman filter in conjunction with 3D depth map. The depth information solve the overlapping problem between hands and face, which is obtained by passive stereo measuring based on cross correlation and the known calibration data of the cameras. Mean-shift analysis uses the gradient of Bhattacharyya coefficient as a similarity function to derive the candidate of the hand that is most similar to a given hand target model. And then, Kalman filter is used to estimate the position of the hand target. The results of hand tracking, tested on various video sequences, are robust to changes in shape as well as partial occlusion.

Extracting Tongue Shape Dynamics from Magnetic Resonance Image Sequences

An important problem in speech research is the automatic extraction of information about the shape and dimensions of the vocal tract during real-time speech production. We have previously developed Southampton dynamic magnetic resonance imaging (SDMRI) as an approach to the solution of this problem.However, the SDMRI images are very noisy so that shape extraction is a major challenge. In this paper, we address the problem of tongue shape extraction, which poses difficulties because this is a highly deforming non-parametric shape. We show that combining active shape models with the dynamic Hough transform allows the tongue shape to be reliably tracked in the image sequence.

Scene Adaptive Shadow Detection Algorithm

Robustness is one of the primary performance criteria for an Intelligent Video Surveillance (IVS) system. One of the key factors in enhancing the robustness of dynamic video analysis is,providing accurate and reliable means for shadow detection. If left undetected, shadow pixels may result in incorrect object tracking and classification, as it tends to distort localization and measurement information. Most of the algorithms proposed in literature are computationally expensive; some to the extent of equalling computational requirement of motion detection. In this paper, the homogeneity property of shadows is explored in a novel way for shadow detection. An adaptive division image (which highlights homogeneity property of shadows) analysis followed by a relatively simpler projection histogram analysis for penumbra suppression is the key novelty in our approach.

Feature Point Reduction for Video Stabilization

Corner detection and optical flow are common techniques for feature-based video stabilization. However, these algorithms are computationally expensive and should be performed at a reasonable rate. This paper presents an algorithm for discarding irrelevant feature points and maintaining them for future use so as to improve the computational cost. The algorithm starts by initializing a maintained set. The feature points in the maintained set are examined against its accuracy for modeling. Corner detection is required only when the feature points are insufficiently accurate for future modeling. Then, optical flows are computed from the maintained feature points toward the consecutive frame. After that, a motion model is estimated based on the simplified affine motion model and least square method, with outliers belonging to moving objects presented. Studentized residuals are used to eliminate such outliers. The model estimation and elimination processes repeat until no more outliers are identified. Finally, the entire algorithm repeats along the video sequence with the points remaining from the previous iteration used as the maintained set. As a practical application, an efficient video stabilization can be achieved by exploiting the computed motion models. Our study shows that the number of times corner detection needs to perform is greatly reduced, thus significantly improving the computational cost. Moreover, optical flow vectors are computed for only the maintained feature points, not for outliers, thus also reducing the computational cost. In addition, the feature points after reduction can sufficiently be used for background objects tracking as demonstrated in the simple video stabilizer based on our proposed algorithm.