Abstract: The large pose discrepancy is one of the critical
challenges in face recognition during video surveillance. Due to
the entanglement of pose attributes with identity information, the
conventional approaches to pose-independent representation perform
poorly when recognizing faces with large pose variations. In
this paper, we propose a practical approach to disentangle the pose
attribute from the identity information followed by synthesis of a face
using a classifier network in latent space. The proposed approach
employs a modified generative adversarial network framework
consisting of an encoder-decoder structure embedded with a classifier
in manifold space for carrying out factorization on the latent
encoding. It can be further generalized to other face and non-face
attributes for real-life video frames containing faces with significant
attribute variations. Experimental results and comparison with the state
of the art in the field show that the learned representation of the
proposed approach synthesizes more compelling perceptual images
through a combination of adversarial and classification losses.
Abstract: Owing to advances in sensor technology, video surveillance has become the main means of security control in every big city in the world. Surveillance is usually used by governments for intelligence gathering, the prevention of crime, the protection of a process, person, group or object, or the investigation of crime. Many surveillance systems based on computer vision technology have been developed in recent years. Moving target tracking is the most common task for Unmanned Aerial Vehicles (UAVs), which find and track objects of interest in mobile aerial surveillance for civilian applications. This paper focuses on vision-based collision avoidance for UAVs using recurrent neural networks. First, images from the cameras on the UAV were fused using a deep convolutional neural network. Then, a recurrent neural network was constructed to obtain high-level image features for object tracking and to extract low-level image features for noise reduction. The system distributed its computation across local and cloud platforms to efficiently perform object detection, tracking and collision avoidance based on multiple UAVs. Experiments on several challenging datasets showed that the proposed algorithm outperforms the state-of-the-art methods.
Abstract: This paper presents a real-time video surveillance system capable of tracking multiple objects in real time using Polar Vector Median (PVM) and Block Coding Modes (BCM) with Global Motion Compensation (GMC). The method works in the compressed domain and uses the motion vectors and BCM from the compressed bit stream to perform real-time object tracking. Tracking is based on the neighboring Motion Vectors (MVs), using a method called PVM. Since global motion (GM) adds to the object’s native motion, it is important to remove GM from the MV field prior to further processing for accurate tracking. The proposed method is tested on a number of standard sequences, and the results show its advantages over some current methods.
Abstract: This research work aims to develop a system that analyzes and identifies students who engage in malpractice or suspicious activities during an offline academic examination. Automated video surveillance provides an optimal solution that helps in monitoring the students and identifying a malpractice event immediately. This work is organized into three modules. The first module performs an impersonation check using a PCA-based face recognition method, cross-checking the student's profile against the database. This module also determines the presence or absence of the student by implementing an image registration technique in which a grid is formed from all the images registered using the frontal camera at the determined positions. The second module detects facial malpractices, in which a student converses with another student or tries to obtain unauthorized information, based on a threshold range evaluated from the student's mouth state (open or closed). The third module identifies unauthorized material or gadgets used in the examination hall by training on positive samples of the object through various stages. Here, a top-view camera feed is analyzed to detect suspicious activities. The system automatically alerts the administration when any suspicious activity is identified, thereby reducing the error rate of manual monitoring. This work improves on our previously published work on identifying suspicious activities by examinees in an offline examination.
Abstract: In this paper, we present an approach to modeling human behavior in video scenes. The approach is used to model normal behaviors in conference halls. We use Probabilistic Latent Semantic Analysis (PLSA) with the 'Bag-of-Terms' paradigm as a tool for exploring video data and learning the model by grouping similar activities. Our term vocabulary consists of 3D spatio-temporal patch groups labeled by direction of motion. Our video representation preserves the spatial information, the object trajectory, and the motion. The main value of this approach is that it can be adapted to detect abnormal behaviors in order to enhance human security.
Abstract: Video surveillance systems such as closed-circuit television (CCTV) are widely deployed to capture video of unspecified people for crime prevention, monitoring, and many other purposes. However, abuse of CCTV raises concerns about invasion of personal privacy. In this paper, we propose an encryption method that protects personal privacy in an H.264 compressed video bitstream by encrypting only regions of interest (ROI). There is no need to change the existing video surveillance system. Encrypting ROI in a compressed video bitstream is challenging, however, because of spatial and temporal drift errors. For this reason, we propose a novel drift mitigation method for use when ROI is encrypted. The proposed method was implemented using the JM reference software on H.264 compressed videos, and experimental results verify the proposed methods and demonstrate their effectiveness.
Abstract: Fire-related incidents account for extensive loss of life and
material damage. Quick and reliable detection of occurring fires has high
real-world implications. Whereas a major research focus lies on the detection
of outdoor fires, indoor camera-based fire detection is still an open issue.
Cameras in combination with computer vision help to detect flames and
smoke more quickly than conventional fire detectors. In this work, we present
a computer vision-based smoke detection algorithm based on contrast changes
and a multi-step classification. This work accelerates computer vision-based
fire detection considerably in comparison with classical indoor-fire detection.
Abstract: Human skin detection is recognized as a primary step in applications such as face detection, illicit image filtering, hand recognition and video surveillance. The performance of any skin detection application relies greatly on two components: the feature extraction and the classification method. Skin color is the most vital information used for skin detection. However, a color feature alone sometimes cannot handle images whose color distribution is the same as that of skin. A pixel-based color feature does not eliminate skin-like colors because the intensities of skin and skin-like colors fall under the same distribution. Hence, statistical color analysis, such as the mean and standard deviation, is exploited as an additional feature to increase the reliability of the skin detector. In this paper, we study the effectiveness of statistical color features for human skin detection. Furthermore, the paper analyzes the integrated color and texture features using eight classifiers with three color spaces: RGB, YCbCr, and HSV. The experimental results show that integrating the statistical feature with a Random Forest classifier achieves strong performance, with an F1-score of 0.969.
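As a rough illustration of the statistical color feature described above, the following sketch computes the per-channel mean and standard deviation over a small patch of pixels. The function name `color_stat_features`, the patch format, and the use of the population standard deviation are assumptions made for this example; the abstract does not specify them.

```python
from statistics import mean, pstdev


def color_stat_features(patch):
    """Return [mean_c1, mean_c2, mean_c3, std_c1, std_c2, std_c3] for a patch.

    `patch` is a list of 3-channel pixel tuples (a hypothetical input format);
    the channels could be RGB, YCbCr, or HSV values.
    """
    channels = list(zip(*patch))  # regroup pixels into three per-channel sequences
    means = [mean(c) for c in channels]
    stds = [pstdev(c) for c in channels]  # population standard deviation (assumed)
    return means + stds


# Toy patch: only the first channel varies.
patch = [(2, 0, 10), (4, 0, 10), (4, 0, 10), (4, 0, 10),
         (5, 0, 10), (5, 0, 10), (7, 0, 10), (9, 0, 10)]
features = color_stat_features(patch)
```

The resulting six-element vector could be appended to the raw pixel color before it is passed to a classifier.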
Abstract: Advances in image and video processing techniques have enabled the development of intelligent video surveillance systems. This study aimed to automatically detect moving human objects and to analyze events of dual human interaction in a surveillance scene. Our system was developed in four major steps: image preprocessing, human object detection, human object tracking, and motion trajectory analysis. Adaptive background subtraction and image processing techniques were used to detect and track moving human objects. To solve the occlusion problem during the interaction, a Kalman filter was used to retain a complete trajectory for each human object. Finally, motion trajectory analysis was developed to distinguish between interaction and non-interaction events based on derivatives of trajectories related to the speed of the moving objects. Using a database of 60 video sequences, our system achieved a classification accuracy of 80% for interaction events and 95% for non-interaction events. In summary, we have explored a system for the automatic classification of interaction and non-interaction events using surveillance cameras. Ultimately, this system could be incorporated in an intelligent surveillance system for the detection and/or classification of abnormal or criminal events (e.g., theft, snatching, fighting, etc.).
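The role of the Kalman filter during occlusion can be sketched with a one-dimensional constant-velocity model: when a measurement is missing, the filter only predicts, so the trajectory stays continuous. The motion model, the noise values `q` and `r`, and the use of `None` to mark occluded frames are assumptions for illustration, not details taken from the study.

```python
def kalman_track(measurements, dt=1.0, q=1e-3, r=1.0):
    """Filter one coordinate of a trajectory; None marks an occluded frame."""
    # State: position p and velocity v; P is the 2x2 state covariance.
    p = next(m for m in measurements if m is not None)
    v = 0.0
    P = [[1.0, 0.0], [0.0, 1.0]]
    estimates = []
    for z in measurements:
        # Predict step: x = F x, P = F P F^T + Q with F = [[1, dt], [0, 1]].
        p = p + dt * v
        a00 = P[0][0] + dt * P[1][0]
        a01 = P[0][1] + dt * P[1][1]
        a10, a11 = P[1][0], P[1][1]
        P = [[a00 + dt * a01 + q, a01],
             [a10 + dt * a11, a11 + q]]
        if z is not None:
            # Update step with measurement matrix H = [1, 0].
            s = P[0][0] + r                      # innovation covariance
            k0, k1 = P[0][0] / s, P[1][0] / s    # Kalman gain
            y = z - p                            # innovation
            p += k0 * y
            v += k1 * y
            P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                 [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
        estimates.append(p)
    return estimates


# Object moves about one unit per frame and is occluded in frames 4 and 5.
track = kalman_track([0.0, 1.0, 2.0, 3.0, None, None, 6.0, 7.0])
```

For image trajectories, the same filter could be run independently on the x and y coordinates, or replaced by a joint four-dimensional state.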
Abstract: The detection of moving objects from video image
sequences is very important for object tracking, activity recognition,
and behavior understanding in video surveillance.
The most widely used approach for moving object detection and tracking is
background subtraction. Many approaches have been
suggested for background subtraction, but they are sensitive to illumination
changes, and the solutions proposed to bypass this problem
are time consuming.
In this paper, we propose a robust yet computationally efficient
background subtraction approach and, mainly, focus on the ability to
detect moving objects on dynamic scenes, for possible applications in
complex and restricted access areas monitoring, where moving and
motionless persons must be reliably detected. It consists of three
main phases: establishing invariance to illumination changes,
background/foreground modeling, and morphological analysis for
noise removal.
We handle illumination changes using Contrast Limited Adaptive Histogram
Equalization (CLAHE), which limits the intensity of each pixel to
a user-determined maximum. Thus, it mitigates the degradation due to
scene illumination changes and improves the visibility of the video
signal. Initially, the background and foreground images are extracted
from the video sequence. Then, the background and foreground
images are separately enhanced by applying CLAHE.
In order to form multi-modal backgrounds we model each channel
of a pixel as a mixture of K Gaussians (K=5) using a Gaussian Mixture
Model (GMM). Finally, we post-process the resulting binary
foreground mask using morphological erosion and dilation
transformations to remove possible noise.
For experimental evaluation, we used a standard dataset to challenge the
efficiency and accuracy of the proposed method on a diverse set of
dynamic scenes.
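The final post-processing step, erosion followed by dilation of the binary foreground mask, can be sketched as below. The 3 × 3 square structuring element and the treatment of out-of-image pixels as background are assumptions; the abstract does not state which element or border handling was used.

```python
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _inside(mask, y, x):
    return 0 <= y < len(mask) and 0 <= x < len(mask[0])


def erode(mask):
    """A pixel survives only if its whole 3x3 neighborhood is foreground."""
    return [[int(all(_inside(mask, y + dy, x + dx) and mask[y + dy][x + dx]
                     for dy, dx in OFFSETS))
             for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    """A pixel becomes foreground if any 3x3 neighbor is foreground."""
    return [[int(any(_inside(mask, y + dy, x + dx) and mask[y + dy][x + dx]
                     for dy, dx in OFFSETS))
             for x in range(len(mask[0]))]
            for y in range(len(mask))]


# A 3x3 foreground blob plus one isolated noise pixel at row 2, column 5.
noisy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
cleaned = dilate(erode(noisy))  # erosion removes the speck, dilation restores the blob
```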
Abstract: Real-time image and video processing is in demand in
many computer vision applications, e.g. video surveillance, traffic
management and medical imaging. The processing of those video
applications requires high computational power. Thus, the optimal
solution is the collaboration of CPU and hardware accelerators. In
this paper, a Canny edge detection hardware accelerator is proposed.
Edge detection is one of the basic building blocks of video and image
processing applications. It is a common block in the pre-processing
phase of the image and video processing pipeline. Our
approach offloads the Canny edge detection algorithm from the
processing system (PS) to programmable logic (PL), taking
advantage of the High Level Synthesis (HLS) tool flow to accelerate the
implementation on the Zynq platform. The resulting implementation
enables up to a 100x performance improvement through hardware
acceleration. CPU utilization drops and the frame rate
rises to 60 fps for a 1080p full HD input video stream.
Abstract: Enhancing the quality of two-dimensional signals is one of the most important factors in the fields of video surveillance and computer vision. In real-life video surveillance, false detections usually occur due to the presence of random noise, illumination and shadow artifacts. Detection methods based on background subtraction face several problems in accurately detecting objects in realistic environments. In this paper, we propose a noise removal algorithm using a neighborhood comparison method with thresholding. Illumination variations in the detected foreground objects are corrected using a combination of techniques: homomorphic decomposition, curvelet transformation and a gamma adjustment operator. Shadows are removed using a chromaticity estimator with a local relation estimator. The results are compared with existing methods and demonstrate high robustness in video surveillance.
Abstract: In this paper, a simple moving human detection method is proposed for video surveillance or access monitoring systems. The frame difference and a noise threshold are used for initial detection of a moving human object, and a simple labeling method is applied for final human-object segmentation. The simulation results show that the algorithm detects moving human objects quickly, achieving a correct detection rate of 95%. The results confirm that the proposed algorithm can be used in an intelligent video access monitoring system.
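The two stages named above, frame differencing with a noise threshold followed by labeling, can be sketched as follows. The 4-connected flood-fill labeling, the grayscale list-of-lists frame format, and the threshold value are assumptions chosen for this example; the abstract does not describe the labeling method in detail.

```python
from collections import deque


def moving_mask(prev, curr, threshold):
    """Mark pixels whose absolute frame difference exceeds the noise threshold."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def label_regions(mask):
    """Assign a distinct label to each 4-connected foreground region."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill of one region
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


prev_frame = [[10] * 5 for _ in range(4)]
curr_frame = [
    [10, 10, 10, 10, 10],
    [60, 60, 10, 10, 13],  # the change of 3 at the right edge is below the threshold
    [10, 10, 10, 70, 70],
    [10, 10, 10, 70, 10],
]
mask = moving_mask(prev_frame, curr_frame, threshold=20)
labels, num_objects = label_regions(mask)
```

In practice, very small labeled regions could additionally be discarded as residual noise before treating the rest as human-object candidates.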
Abstract: In this paper, the detection and tracking of the face, mouth, hands and medication bottles in the context of camera-based medication intake monitoring is presented. The aim is to recognize medication intake by elderly people in their home setting in order to avoid inappropriate use. Background subtraction is used to isolate moving objects, and skin and bottle segmentation is then done in the normalized RGB color space. We use a minimum displacement distance criterion to track skin color regions and the R/G ratio to detect the mouth. The color-labeled medication bottles are tracked based on the color space distance to their mean color vector. For the recognition of medication intake, we propose a three-level hierarchical approach, which uses activity patterns to recognize the normal medication intake activity. The proposed method was tested with three persons and different medication intake scenarios, and gave an overall precision of over 98%.
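The color quantities mentioned above are simple to compute. The sketch below shows normalized RGB, the R/G ratio, and the Euclidean distance to a mean color vector. The function names, the handling of zero denominators, and the choice of Euclidean distance are assumptions; no thresholds are given because the abstract does not state any.

```python
import math


def normalized_rgb(r, g, b):
    """Divide each channel by the channel sum to discount overall brightness."""
    total = r + g + b
    if total == 0:
        return (0.0, 0.0, 0.0)  # black pixel: chromaticity undefined, return zeros
    return (r / total, g / total, b / total)


def rg_ratio(r, g):
    """Ratio of red to green; larger values indicate redder pixels such as lips."""
    return r / g if g else float("inf")


def color_distance(pixel, mean_color):
    """Euclidean distance between a pixel and a reference mean color vector."""
    return math.dist(pixel, mean_color)


dim = normalized_rgb(100, 50, 50)
bright = normalized_rgb(200, 100, 100)  # same chromaticity at twice the brightness
ratio = rg_ratio(200, 100)
dist = color_distance((3, 4, 0), (0, 0, 0))
```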
Abstract: People detection from images has a variety of applications, such as video surveillance and driver assistance systems, but it is still a challenging task, and it is more difficult in crowded environments such as shopping malls, where the lower parts of the human body are often occluded. The lack of full-body information requires more effective features than common ones such as HOG. In this paper, new features are introduced that exploit the global self-symmetry (GSS) characteristic of head-shoulder patterns. The features encode the similarity or difference of color histograms and oriented gradient histograms between two vertically symmetric blocks. The domain-specific features are fast to compute from the integral images in the Viola-Jones cascade-of-rejecters framework. The proposed features are evaluated on our own head-shoulder dataset, which, in part, consists of the well-known INRIA pedestrian dataset. Experimental results show that the GSS features reduce false alarms marginally and that the gradient GSS features are selected more often than the color GSS ones in feature selection.
Abstract: Counting people from a video stream in a noisy environment is a challenging task. This project aims at developing a counting system for transport vehicles, integrated in a video surveillance product. This article presents a method for the detection and tracking of multiple faces in a video by using a model of first and second order local moments. An iterative process is used to estimate the position and shape of multiple faces in images, and to track them. The trajectories are then processed to count people entering and leaving the vehicle.
Abstract: This paper presents a highly efficient algorithm for detecting and tracking humans and objects in video surveillance sequences. Mean shift clustering is applied on background-differenced image sequences. For efficiency, all calculations are performed on integral images. Novel corresponding exponential integral kernels are introduced to allow the application of nonuniform kernels for clustering, which dramatically increases robustness without giving up the efficiency of the integral data structures. Experimental results demonstrating the power of this approach are presented.
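The efficiency argument rests on the integral image (summed-area table), which returns the sum over any axis-aligned rectangle in constant time. The sketch below covers only this standard uniform-kernel case; the exponential integral kernels proposed in the paper are not reproduced here, and the function names are illustrative.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom][left:right] using four table lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
total = box_sum(ii, 0, 0, 3, 3)         # whole image
lower_right = box_sum(ii, 1, 1, 3, 3)   # 5 + 6 + 8 + 9
```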
Abstract: Robustness is one of the primary performance criteria for an Intelligent Video Surveillance (IVS) system. One of the key factors in enhancing the robustness of dynamic video analysis is providing accurate and reliable means for shadow detection. If left undetected, shadow pixels may result in incorrect object tracking and classification, as they tend to distort localization and measurement information. Most of the algorithms proposed in the literature are computationally expensive, some to the extent of equalling the computational requirement of motion detection. In this paper, the homogeneity property of shadows is explored in a novel way for shadow detection. An adaptive division image (which highlights the homogeneity property of shadows) analysis followed by a relatively simpler projection histogram analysis for penumbra suppression is the key novelty in our approach.
Abstract: Human activity is a major concern in a wide variety of
applications, such as video surveillance, human computer interface
and face image database management. Detecting and recognizing
faces is a crucial step in these applications. Furthermore, major
advancements and initiatives in security applications in the past years
have propelled face recognition technology into the spotlight. The
performance of existing face recognition systems declines significantly
if the resolution of the face image falls below a certain level.
This is especially critical in surveillance imagery where often, due to
many reasons, only low-resolution video of faces is available. If these
low-resolution images are passed to a face recognition system, the
performance is usually unacceptable. Hence, resolution plays a key
role in face recognition systems. In this paper we introduce a new
low resolution face recognition system based on mixture of expert
neural networks. In order to produce the low resolution input images
we down-sampled the 48 × 48 ORL images to 12 × 12 ones using
the nearest neighbor interpolation method; applying
the bicubic interpolation method afterwards yields enhanced images, which are
given to the Principal Component Analysis feature extractor.
Comparison with some of the most related methods indicates that
the proposed model yields an excellent recognition rate in low
resolution face recognition, namely a recognition rate of 100% for
the training set and 96.5% for the test set.
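The down-sampling step from 48 × 48 to 12 × 12 can be sketched with nearest neighbor sampling as below. The sampling convention (taking the top-left pixel of each 4 × 4 cell) is an assumption; image libraries differ in how they place sample positions, and the abstract does not say which convention was used.

```python
def downsample_nearest(img, out_h, out_w):
    """Pick one source pixel per output pixel using integer index scaling."""
    in_h, in_w = len(img), len(img[0])
    return [[img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


# Synthetic 48x48 "image" whose pixel value encodes its own position.
image = [[y * 48 + x for x in range(48)] for y in range(48)]
small = downsample_nearest(image, 12, 12)
```

The subsequent bicubic enhancement and PCA feature extraction are not shown; they would normally rely on an image processing or numerical library.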
Abstract: Shadow detection is still considered as one of the
potential challenges for intelligent automated video surveillance
systems. A prerequisite for reliable and accurate detection and
tracking is the correct shadow detection and classification. In such a
landscape of conditions, privacy issues add more and more
complexity and require reliable shadow detection.
In this work the intertwining between security, accuracy,
reliability and privacy is analyzed and, accordingly, a novel
architecture for Privacy Enhancing Video Surveillance (PEVS) is
introduced. Shadow detection and masking are dealt with through the
combination of two different approaches simultaneously. This results
in a unique privacy enhancement, without affecting security.
Subsequently, the methodology was employed successfully in a
large-scale wireless video surveillance system; privacy relevant
information was stored and encrypted on the unit, without
transferring it over an untrusted network.