A Review on Image Segmentation Techniques and Performance Measures

Image segmentation is a method for extracting regions of interest from an image, and it remains a fundamental problem in computer vision. The increasing diversity and complexity of segmentation algorithms have led us, first, to review and classify segmentation techniques; second, to identify the most commonly used measures of segmentation performance; and third, to discuss segmentation philosophy in depth in order to guide the choice of adequate segmentation techniques for particular applications. To justify the relevance of our analysis, recent segmentation algorithms are presented through the proposed classification.

A Motion Dictionary to Real-Time Recognition of Sign Language Alphabet Using Dynamic Time Warping and Artificial Neural Network

Computational recognition of sign languages aims to allow greater social and digital inclusion of deaf people through the interpretation of their language by computer. This article presents a model for recognizing two of the global parameters of sign languages: hand configurations and hand movements. Hand motion is captured through infrared technology, and the hand's joints are reconstructed in a virtual three-dimensional space. A Multilayer Perceptron (MLP) neural network is used to classify hand configurations, and Dynamic Time Warping (DTW) recognizes hand motion. Beyond the sign recognition method, we provide a dataset of hand configurations and motion capture built with the help of professionals fluent in sign languages. Although this technology can be used to translate any sign from any sign dictionary, Brazilian Sign Language (Libras) was used as a case study. Finally, the model presented in this paper achieved a recognition rate of 80.4%.
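To illustrate how dynamic time warping can compare motion sequences of different lengths, the following is a minimal sketch, not the authors' implementation. It assumes each motion is reduced to a one-dimensional sequence of numbers; the function names and the template dictionary are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between samples
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def classify_motion(sample, templates):
    """Return the label of the template with the smallest DTW cost.

    `templates` is a hypothetical dict mapping a label to a reference sequence.
    """
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))
```

Because the alignment may repeat samples, a slower or faster execution of the same movement still matches its template with low cost.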

MITOS-RCNN: Mitotic Figure Detection in Breast Cancer Histopathology Images Using Region Based Convolutional Neural Networks

Studies estimate that there will be 266,120 new cases of invasive breast cancer and 40,920 breast cancer induced deaths in 2018 alone. Despite the pervasiveness of this affliction, the current process for obtaining an accurate breast cancer prognosis is tedious and time-consuming. It usually requires a trained pathologist to manually examine histopathological images and identify the features that characterize various cancer severity levels. We propose MITOS-RCNN: a region-based convolutional neural network (RCNN) geared for small object detection to accurately grade one of the three factors that characterize tumor belligerence described by the Nottingham Grading System: mitotic count. Other computational approaches to mitotic figure counting and detection do not demonstrate ample recall or precision to be clinically viable. Our models outperformed all previous participants in the ICPR 2012 challenge, the AMIDA 2013 challenge and the MITOS-ATYPIA-14 challenge, along with recently published works. Our model achieved an F-measure score of 0.955, a 6.11% improvement in accuracy over the most accurate of the previously proposed models.
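For readers unfamiliar with the reported metric, the sketch below shows how precision, recall and the F-measure (F1) are conventionally derived from detection counts. The function name and the counts in the test are illustrative assumptions, not values from the paper.

```python
def f_measure(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```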

Robust Image Registration Based on an Adaptive Normalized Mutual Information Metric

Image registration is an important topic for many imaging systems and computer vision applications. Standard image registration techniques, such as methods based on mutual information or normalized mutual information, have limited performance because they do not consider spatial information or the relationships between neighbouring pixels or voxels. In addition, image noise may significantly affect registration accuracy. This paper therefore proposes an efficient method that explicitly considers the relationships between adjacent pixels: the gradient information of the reference and scene images is extracted first, and the cosine similarity of the extracted gradients is then computed and used to improve the accuracy of the standard normalized mutual information measure. Our experimental results on different data types (i.e. CT, MRI and thermal images) show that the proposed method outperforms a number of image registration techniques in terms of accuracy.
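The two ingredients named in this abstract can be sketched as follows. This is an illustration under assumptions: images are given as flat lists of discrete intensities, gradients as lists of (gx, gy) pairs, and NMI is taken in the common form (H(A) + H(B)) / H(A, B). How the paper combines the two quantities is not stated here, so the sketch computes them separately.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in nats) of a histogram given as positive counts."""
    return -sum(c / total * math.log(c / total) for c in counts)


def nmi(a, b):
    """Normalized mutual information (H(A) + H(B)) / H(A, B) of two intensity lists."""
    n = len(a)
    h_a = entropy(Counter(a).values(), n)
    h_b = entropy(Counter(b).values(), n)
    h_ab = entropy(Counter(zip(a, b)).values(), n)  # joint histogram
    if h_ab == 0:
        raise ValueError("joint entropy is zero; NMI is undefined")
    return (h_a + h_b) / h_ab


def gradient_cosine(ga, gb):
    """Mean cosine similarity between paired gradient vectors (gx, gy)."""
    sims = []
    for (ax, ay), (bx, by) in zip(ga, gb):
        na, nb = math.hypot(ax, ay), math.hypot(bx, by)
        if na and nb:  # skip flat regions with zero gradient
            sims.append((ax * bx + ay * by) / (na * nb))
    return sum(sims) / len(sims) if sims else 0.0
```

With this definition NMI ranges from 1 (statistically independent images) to 2 (one image fully determines the other).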

Simplified Mobile AR Platform Design for Augmented Tourism

This study outlines iterations of designing mobile augmented reality (MAR) applications for tourism-specific contexts. Using a design-based research model, several cycles from development to implementation were analyzed and refined with the goal of building a MAR platform that would facilitate the creation of augmented tours and environments by non-technical users. The project went through several stages, and in the process a simple framework began to emerge that can inform the design and use of MAR applications for tourism contexts. As a result of these iterations, a platform was developed that allows novice computer users to create augmented tourism environments. The system connects existing, widely used tools such as Google Forms to the computer vision algorithms needed for more advanced augmented tourism environments. The study concludes with a discussion of this MAR platform and reveals design elements that have implications for tourism contexts. The study also points to future use cases and design approaches for augmented tourism.

Motion-Based Detection and Tracking of Multiple Pedestrians

Tracking of moving people has gained great importance due to rapid technological advancements in the field of computer vision. The objective of this study is to design a motion-based method for detecting and tracking multiple pedestrians walking randomly in different directions. In our proposed method, a Gaussian mixture model (GMM) is used to detect moving persons in image sequences. It adapts to changes that take place in the scene, such as varying illumination and moving objects that start and stop often. Background noise in the scene is eliminated by applying morphological operations, and the motion of tracked people is determined using the Kalman filter. The Kalman filter is applied to predict the tracked location in each frame and to determine the likelihood of each detection. We used a benchmark data set for the evaluation, recorded by a stationary camera mounted on a side wall. The scenes in the data set were recorded on a street and include up to eight people in front of the camera in two different scenes, lasting 53 and 35 seconds, respectively. For pedestrians walking in close proximity, the proposed method achieved a detection ratio of 87% and a tracking ratio of 77%. When the pedestrians are farther apart from each other, the detection ratio increases to 90% and the tracking ratio to 79%.
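The predict and update cycle of a Kalman filter can be sketched for a single image coordinate as below. This is a simplified illustration, assuming a constant-velocity motion model and a position-only measurement; the class name and the noise values `q` and `r` are hypothetical and not taken from the study.

```python
class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate (state: position, velocity)."""

    def __init__(self, pos, vel=0.0, var=1.0, q=1e-3, r=0.25):
        self.p, self.v = pos, vel
        self.P = [[var, 0.0], [0.0, var]]  # state covariance
        self.q, self.r = q, r              # assumed process / measurement noise

    def predict(self, dt=1.0):
        """Propagate the state one frame ahead and return the predicted position."""
        P = self.P
        self.p += self.v * dt
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        self.P = [[p00, p01], [p10, p11]]
        return self.p

    def update(self, z):
        """Correct the state with measured position z.

        Returns the squared innovation divided by its variance, which can
        serve as a gating score for how plausible the detection is.
        """
        P = self.P
        s = P[0][0] + self.r               # innovation variance
        k0, k1 = P[0][0] / s, P[1][0] / s  # Kalman gain
        y = z - self.p                     # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                  [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
        return y * y / s
```

A tracker would typically run one such filter per coordinate and per pedestrian, and assign each new detection to the track with the lowest gating score.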

Burnout Recognition for Call Center Agents by Using Skin Color Detection with Hand Poses

Call centers have been expanding and increasingly influence activity in various markets. Call center work is known as one of the most demanding and stressful jobs. In this paper, we propose a fatigue detection system that detects burnout in call center agents who show signs of neck pain and upper back pain. Our proposed system is based on a computer vision technique that combines skin color detection with the Viola-Jones object detector. To recognize hand poses that signal stress, the YCbCr color space is used to detect the skin color region, including the face and hand poses around the areas related to neck ache and upper back pain. A Viola-Jones cascade of classifiers is used to recognize the face and extract it from the skin color region. Hand poses are then detected, and neck pain and upper back pain evaluated, using the skin color detection and face recognition methods. The system performance is evaluated using two groups of datasets created in the laboratory to simulate a call center environment. Our call center agent burnout detection system has been implemented using a web camera, with processing in MATLAB. In the experimental results, our system achieved 96.3% for upper back pain detection and 94.2% for neck pain detection.
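Skin color detection in the YCbCr space can be sketched as a per-pixel chrominance test. The conversion below is the standard full-range BT.601 (JPEG) formula; the Cb and Cr bounds are commonly quoted illustrative values and are an assumption here, since the abstract does not give the thresholds the system uses.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JPEG) conversion of 8-bit RGB to (Y, Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """True if the pixel's chrominance falls inside the assumed skin ranges."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]
```

Thresholding only Cb and Cr, and ignoring luma Y, is what gives this kind of detector some tolerance to brightness changes.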

Spectrum of Dry Eye Disease in Computer Users of Manipur India

Computer and video display users may complain of asthenopia, burning, dry eyes, and similar symptoms. The management of dry eye is often not matched to its severity. With systematic evaluation and grading, dry eye disease is a condition that can be managed at all levels of ophthalmic care. In the present study, the spectrum of causes of dry eye and the prevalence of dry eye disease in computer users of Manipur, India, are determined in 600 individuals (300 cases and 300 controls). Individuals between 15 and 50 years of age who had used computers for more than 3 hours a day for 1 year or more were included. Tear break-up time (TBUT) and Schirmer’s test were conducted. The results show that 33 (20.4%) of 164 males and 47 (30.3%) of 136 females have dry eye. Possible explanations for the observed results are discussed.

K-Means Based Matching Algorithm for Multi-Resolution Feature Descriptors

Matching high-dimensional features between images is computationally expensive for exhaustive search approaches in computer vision. Although the feature dimension can be reduced by using simplified prior knowledge of the homography, matching accuracy may degrade as a trade-off. In this paper, we present a feature matching method based on the k-means algorithm that reduces the matching cost and matches features between images without relying on a simplified geometric assumption. Experimental results show that the proposed method outperforms previous linear exhaustive search approaches in terms of the inlier ratio of matched pairs.

Detecting Tomato Flowers in Greenhouses Using Computer Vision

This paper presents an image analysis algorithm to detect and count yellow tomato flowers in a greenhouse with uneven illumination conditions, complex growth conditions and different flower sizes. The algorithm is designed to be employed on a drone that flies in greenhouses to accomplish several tasks such as pollination and yield estimation. Detecting the flowers can provide useful information for the farmer, such as the number of flowers in a row and the number of flowers that were pollinated since the last visit to the row. The developed algorithm is designed to handle the real-world difficulties in a greenhouse, which include varying lighting conditions, shadowing, and occlusion, while considering the computational limitations of the simple processor in the drone. The algorithm identifies flowers using an adaptive global threshold, segmentation over the HSV color space, and morphological cues. The adaptive threshold divides the images into darker and lighter images. Then, segmentation on hue, saturation and value is performed accordingly, and classification is done according to the size and location of the flowers. In total, 1069 images of greenhouse tomato flowers were acquired in a commercial greenhouse in Israel using two different RGB cameras: an LG G4 smartphone and a Canon PowerShot A590. The images were acquired from multiple angles and distances and were sampled manually at various periods during the day to obtain varying lighting conditions. Ground truth was created by manually tagging approximately 25,000 individual flowers in the images. Sensitivity analyses on the acquisition angle of the images, periods throughout the day, different cameras and thresholding types were performed. Precision, recall and their derived F1 score were calculated. Results indicate better performance for the view angle facing the flowers than for any other angle. Acquiring images in the afternoon resulted in the best precision and recall. Applying a global adaptive threshold improved the median F1 score by 3%. Results showed no difference between the two cameras used. Using hue values of 0.12-0.18 in the segmentation process provided the best results in precision and recall, and the best F1 score. The average precision and recall for all the images when using these values were 74% and 75%, respectively, with an F1 score of 0.73. Further analysis showed a 5% increase in precision and recall when analyzing images acquired in the afternoon and from the front viewpoint.
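The hue-based part of the segmentation can be sketched with the Python standard library, which expresses hue on a 0 to 1 scale. The sketch assumes that the reported hue range of 0.12-0.18 uses the same scale; the saturation and value minima are hypothetical placeholders, since the abstract does not give them.

```python
import colorsys


def is_flower_pixel(rgb, hue_range=(0.12, 0.18), min_s=0.4, min_v=0.4):
    """True if an 8-bit RGB pixel falls in the yellow hue band.

    min_s and min_v are assumed thresholds, not values from the paper.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return hue_range[0] <= h <= hue_range[1] and s >= min_s and v >= min_v


def flower_mask(image):
    """Binary mask for an image given as rows of (R, G, B) tuples."""
    return [[is_flower_pixel(px) for px in row] for row in image]
```

Pure yellow has a hue of 1/6, about 0.167, which sits inside the reported band; red (hue 0) and green (hue 1/3) fall outside it.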

Challenges in Video Based Object Detection in Maritime Scenario Using Computer Vision

This paper discusses the technical challenges of maritime image processing and machine vision for video streams generated by cameras. Even well-documented problems such as horizon detection and registration of video frames are very challenging in maritime scenarios, and more advanced problems such as background subtraction and object detection in video streams are likewise very challenging. The dynamic nature of the background, the unavailability of static cues, the presence of small objects at distant backgrounds, and illumination effects all contribute to the challenges discussed here.

Paddy/Rice Singulation for Determination of Husking Efficiency and Damage Using Machine Vision

In this study, a system of machine vision and singulation was developed to separate paddy from rice and determine paddy husking and rice breakage percentages. The machine vision system consists of three main components: an imaging chamber, a digital camera, and a computer equipped with image processing software. The singulation device consists of a kernel holding surface, a motor with a vacuum fan, and a dimmer. To separate paddy from rice in the image, it was necessary to set a threshold. Therefore, some images of paddy and rice were sampled and the RGB values of the images were extracted using MATLAB software. The mean and standard deviation of the data were then determined. An image processing algorithm was developed in MATLAB to determine paddy/rice separation, rice breakage and paddy husking percentages using the blue-to-red ratio. Tests showed that a threshold of 0.75 is suitable for separating paddy from rice kernels. Results from the evaluation of the image processing algorithm showed that its accuracies were 98.36% and 91.81% for paddy husking and rice breakage percentage, respectively. Analysis also showed that a suction of 45 mmHg to 50 mmHg, yielding 81.3% separation efficiency, is appropriate for operation of the kernel singulation system.
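A blue-to-red ratio test of this kind can be sketched as below. The 0.75 threshold comes from the abstract; which side of the threshold corresponds to paddy is not stated there, so the `paddy_below` flag is an explicit assumption, and the function names are hypothetical.

```python
def blue_red_ratio(pixels):
    """Mean blue divided by mean red over a kernel's (R, G, B) pixels."""
    mean_r = sum(p[0] for p in pixels) / len(pixels)
    mean_b = sum(p[2] for p in pixels) / len(pixels)
    return mean_b / mean_r


def classify_kernel(pixels, threshold=0.75, paddy_below=True):
    """Label a kernel 'paddy' or 'rice' by thresholding its blue/red ratio.

    paddy_below=True assumes the husk-covered paddy has the lower ratio;
    flip it if the calibration images show the opposite.
    """
    below = blue_red_ratio(pixels) < threshold
    return "paddy" if below == paddy_below else "rice"
```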

Video Based Ambient Smoke Detection By Detecting Directional Contrast Decrease

Fire-related incidents account for extensive loss of life and material damage. Quick and reliable detection of fires therefore has major real-world implications. Whereas a major research focus lies on the detection of outdoor fires, indoor camera-based fire detection is still an open issue. Cameras in combination with computer vision help to detect flames and smoke more quickly than conventional fire detectors. In this work, we present a computer vision-based smoke detection algorithm based on contrast changes and a multi-step classification. This work accelerates computer vision-based fire detection considerably in comparison with classical indoor fire detection.

X-Corner Detection for Camera Calibration Using Saddle Points

This paper discusses a corner detection algorithm for camera calibration. Calibration is a necessary step in many computer vision and image processing applications. Robust corner detection for an image of a checkerboard is required to determine intrinsic and extrinsic parameters. In this paper, an algorithm for fully automatic and robust X-corner detection is presented. Checkerboard corner points are automatically found in each image without user interaction or any prior information regarding the number of rows or columns. The approach represents each X-corner with a quadratic fitting function. Using the fact that the X-corners are saddle points, the coefficients in the fitting function are used to identify each corner location. The automation of this process greatly simplifies calibration. Our method is robust against noise and different camera orientations. Experimental analysis shows the accuracy of our method using actual images acquired at different camera locations and orientations.
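The saddle-point test on a quadratic fit can be sketched as follows. The sketch assumes the local intensity has already been fitted as f(x, y) = a x^2 + b xy + c y^2 + d x + e y + g; the fitting step and the coefficient naming are assumptions, not details from the paper.

```python
def saddle_point(a, b, c, d, e):
    """Return the (x, y) saddle location of the fitted quadratic, or None.

    The Hessian is [[2a, b], [b, 2c]]; a negative determinant means the
    critical point is a saddle, as expected at a checkerboard X-corner.
    """
    det = 4 * a * c - b * b
    if det >= 0:
        return None  # minimum, maximum or degenerate: not an X-corner
    # Solve the gradient equations 2a x + b y = -d and b x + 2c y = -e
    x = (b * e - 2 * c * d) / det
    y = (b * d - 2 * a * e) / det
    return x, y
```

For example, f(x, y) = (x - 1)(y - 2) expands to xy - 2x - y + 2, so the coefficients (0, 1, 0, -2, -1) give a saddle at (1, 2).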

Day/Night Detector for Vehicle Tracking in Traffic Monitoring Systems

Recently, traffic monitoring has attracted the attention of computer vision researchers. Many algorithms have been developed to detect and track moving vehicles. However, vehicle tracking in daytime and in nighttime cannot be approached with the same techniques because of the extremely different illumination conditions. Consequently, traffic monitoring systems need a component that differentiates between daytime and nighttime scenes. In this paper, an HSV-based day/night detector is proposed for traffic monitoring scenes. The detector employs the hue histogram and the value histogram of the top half of the image frame. Experimental results show that extracting the brightness features along with the color features within the top region of the image is effective for classifying traffic scenes. In addition, the detector achieves high precision and recall rates and is feasible for real-time applications.

Multi-Layer Multi-Feature Background Subtraction Using Codebook Model Framework

Background modeling and subtraction in video analysis has been widely used as an effective method for detecting moving objects in many computer vision applications. Recently, a large number of approaches have been developed to tackle different types of challenges in this field. However, dynamic backgrounds and illumination variations are the most frequently occurring problems in practice. This paper presents a two-layer model based on the codebook algorithm combined with the local binary pattern (LBP) texture measure, targeted at handling dynamic background and illumination variation problems. More specifically, the first layer is a block-based codebook that combines the LBP histogram with the mean value of each RGB color channel. Because LBP features are invariant to monotonic gray-scale changes, this layer can produce block-wise detection results with considerable tolerance to illumination variations. A pixel-based codebook is then employed to refine the output of the first layer and further eliminate false positives. As a result, the proposed approach greatly improves accuracy under dynamic background and illumination changes. Experimental results on several popular background subtraction datasets demonstrate very competitive performance compared to previous models.
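The invariance of LBP to monotonic gray-scale changes can be seen in a minimal sketch of the basic 8-neighbour operator. The bit ordering (clockwise from the top-left neighbour) is an arbitrary choice made for this illustration.

```python
def lbp_code(patch):
    """8-bit LBP code of the centre pixel of a 3x3 patch (list of 3 rows)."""
    c = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= c:  # only the ordering relative to the centre matters
            code |= 1 << bit
    return code
```

Since each bit depends only on whether a neighbour is at least as bright as the centre, any strictly increasing intensity mapping, such as a global brightening, leaves the code unchanged.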

Object Motion Tracking Based On Color Detection for Android Devices

This paper presents the development of a robot car that can track the motion of an object by detecting its color through an Android device. The employed computer vision algorithm uses the OpenCV library, embedded in an Android smartphone application, to manipulate the captured image of the object. The captured image is subjected to color conversion and, after color filtering, is transformed into a binary image for further processing. The desired object is clearly identified after removing pixel noise by applying morphological operations and contour definition. Finally, the area and the center of the object are determined so that the object’s motion can be tracked. The smartphone running the application is placed on a robot car and transmits motion directives via Bluetooth to an Arduino assembly so that the car follows objects of a specified color. The experimental evaluation of the proposed algorithm shows reliable color detection and smooth tracking characteristics.
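The pipeline of color filtering, binary mask, area and center can be sketched in plain Python as below. The paper uses OpenCV on Android; this sketch replaces those calls with explicit loops for illustration, and the function names, the dead zone and the directive strings are hypothetical.

```python
def color_mask(image, lo, hi):
    """Binary mask: True where every channel of a pixel lies within [lo, hi]."""
    return [[all(l <= ch <= h for ch, l, h in zip(px, lo, hi)) for px in row]
            for row in image]


def area_and_center(mask):
    """Pixel count and centroid (x, y) of the True region, or (0, None)."""
    area = sum_x = sum_y = 0
    for y, row in enumerate(mask):
        for x, inside in enumerate(row):
            if inside:
                area += 1
                sum_x += x
                sum_y += y
    if area == 0:
        return 0, None
    return area, (sum_x / area, sum_y / area)


def steering_directive(center_x, frame_width, dead_zone=0.1):
    """Map the object's horizontal offset from the frame centre to a directive."""
    offset = center_x - (frame_width - 1) / 2.0
    if abs(offset) <= dead_zone * frame_width:
        return "forward"
    return "right" if offset > 0 else "left"
```

The area can additionally serve as a rough distance cue: a shrinking area suggests the object is moving away from the camera.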

Human Motion Capture: New Innovations in the Field of Computer Vision

Human motion capture has become one of the major areas of interest in the field of computer vision. Some of the major application areas that have been rapidly evolving include advanced human interfaces, virtual reality and security/surveillance systems. This study provides a brief overview of the techniques and applications used for markerless human motion capture, which deals with analyzing human motion in the form of mathematical formulations. The major contribution of this research is that it classifies the computer vision based techniques of human motion capture according to a taxonomy, and then breaks them down into four systematically different categories: tracking, initialization, pose estimation and recognition. Detailed descriptions of the tracking and pose estimation techniques, and of the relationships between them, are given. The subcategories of each process are further described. The various hypotheses that researchers in this domain have used are surveyed, and the evolution of these techniques is explained. The survey concludes that most researchers have focused on using mathematical body models for markerless motion capture.

Stereo Motion Tracking

Motion tracking and stereo vision are complicated, albeit well-understood, problems in computer vision. Existing software that combines the two approaches to perform stereo motion tracking typically employs complicated and computationally expensive procedures. The purpose of this study is to create a simple and effective solution capable of combining the two approaches. The study explores a strategy that combines two techniques: two-dimensional motion tracking using a Kalman filter, and depth detection of objects using stereo vision. In conventional approaches, objects in the scene of interest are observed using a single camera. For stereo motion tracking, however, the scene of interest is observed using video feeds from two calibrated cameras. Using two simultaneous measurements from the two cameras, the depth of the object from the plane containing the cameras is calculated. The approach attempts to capture the entire three-dimensional spatial information of each object in the scene and represent it through a software estimator object. At discrete intervals, the estimator tracks object motion in the plane parallel to the plane containing the cameras and updates the perpendicular distance of the object from that plane as depth. The ability to efficiently track the motion of objects in three-dimensional space using a simplified approach could prove to be an indispensable tool in a variety of surveillance scenarios. The approach may find application in settings ranging from high-security surveillance scenes, such as the premises of bank vaults, prisons or other detention facilities, to low-cost applications in supermarkets and car parking lots.
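The depth calculation from two simultaneous measurements can be sketched with the standard pinhole stereo relation Z = f B / d. This assumes rectified, horizontally aligned cameras with focal length f in pixels and baseline B in metres; the abstract does not state the exact formula used, and the parameter names are hypothetical.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth (metres) of a point seen at column x_left / x_right in a rectified pair."""
    disparity = x_left - x_right  # in pixels; positive for points in front of the rig
    if disparity <= 0:
        raise ValueError("disparity must be positive for a valid depth")
    return focal_px * baseline_m / disparity
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant objects than for nearby ones.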

High Level Synthesis of Canny Edge Detection Algorithm on Zynq Platform

Real-time image and video processing is in demand in many computer vision applications, e.g. video surveillance, traffic management and medical imaging. Processing for these video applications requires high computational power. Thus, the optimal solution is the collaboration of a CPU and hardware accelerators. In this paper, a Canny edge detection hardware accelerator is proposed. Edge detection is one of the basic building blocks of video and image processing applications. It is a common block in the pre-processing phase of the image and video processing pipeline. Our approach offloads the Canny edge detection algorithm from the processing system (PS) to the programmable logic (PL), taking advantage of the High Level Synthesis (HLS) tool flow to accelerate the implementation on the Zynq platform. The resulting implementation enables up to a 100x performance improvement through hardware acceleration. CPU utilization drops and the frame rate rises to 60 fps for a 1080p full HD input video stream.