People Counting in Transport Vehicles

Counting people from a video stream in a noisy environment is a challenging task. This project aims at developing a counting system for transport vehicles, integrated in a video surveillance product. This article presents a method for the detection and tracking of multiple faces in a video by using a model of first and second order local moments. An iterative process is used to estimate the position and shape of multiple faces in images, and to track them. the trajectories are then processed to count people entering and leaving the vehicle.

Bandwidth, Area Efficient and Target Device Independent DDR SDRAM Controller

The application of the synchronous dynamic random access memory (SDRAM) has gone beyond the scope of personal computers for quite a long time. It comes into hand whenever a big amount of low price and still high speed memory is needed. Most of the newly developed stand alone embedded devices in the field of image, video and sound processing take more and more use of it. The big amount of low price memory has its trade off – the speed. In order to take use of the full potential of the memory, an efficient controller is needed. Efficient stands for maximum random accesses to the memory both for reading and writing and less area after implementation. This paper proposes a target device independent DDR SDRAM pipelined controller and provides performance comparison with available solutions.

The Design and Development of Multimedia Pronunciation Learning Management System

The proposed Multimedia Pronunciation Learning Management System (MPLMS) in this study is a technology with profound potential for inducing improvement in pronunciation learning. The MPLMS optimizes the digitised phonetic symbols with the integration of text, sound and mouth movement video. The components are designed and developed in an online management system which turns the web to a dynamic user-centric collection of consistent and timely information for quality sustainable learning. The aim of this study is to design and develop the MPLMS which serves as an innovative tool to improve English pronunciation. This paper discusses the iterative methodology and the three-phase Alessi and Trollip model in the development of MPLMS. To align with the flexibility of the development of educational software, the iterative approach comprises plan, design, develop, evaluate and implement is followed. To ensure the instructional appropriateness of MPLMS, the instructional system design (ISD) model of Alessi and Trollip serves as a platform to guide the important instructional factors and process. It is expected that the results of future empirical research will support the efficacy of MPLMS and its place as the premier pronunciation learning system.

Dynamic Visualization on Student's Performance, Retention and Transfer of Procedural Learning

This study examined the effects of two dynamic visualizations on 60 Malaysian primary school student-s performance (time on task), retention and transference. The independent variables in this study were the two dynamic visualizations, the video and the animated instructions. The dependent variables were the gain score of performance, retention and transference. The results showed that the students in the animation group significantly outperformed the students in the video group in retention. There were no significant differences in terms of gain scores in the performance and transference among the animation and the video groups, although the scores were slightly higher in the animation group compared to the video group. The conclusion of this study is that the animation visualization is superior compared to the video in the retention for a procedural task.

Design of FIR Filter for Water Level Detection

This paper proposes a new design of spatial FIR filter to automatically detect water level from a video signal of various river surroundings. A new approach in this report applies "addition" of frames and a "horizontal" edge detector to distinguish water region and land region. Variance of each line of a filtered video frame is used as a feature value. The water level is recognized as a boundary line between the land region and the water region. Edge detection filter essentially demarcates between two distinctly different regions. However, the conventional filters are not automatically adaptive to detect water level in various lighting conditions of river scenery. An optimized filter is purposed so that the system becomes robust to changes of lighting condition. More reliability of the proposed system with the optimized filter is confirmed by accuracy of water level detection.

Introducing an Image Processing Base Idea for Outdoor Children Caring

In this paper application of artificial intelligence for baby and children caring is studied. Then a new idea for injury prevention and safety announcement is presented by using digital image processing. The paper presents the structure of the proposed system. The system determines the possibility of the dangers for children and babies in yards, gardens and swimming pools or etc. In the presented idea, multi camera System is used and receiver videos are processed to find the hazardous areas then the entrance of children and babies in the determined hazardous areas are analyzed. In this condition the system does the programmed action capture, produce alarm or tone or send message.

Subjective Assessment about Super Resolution Image Resolution

Super resolution (SR) technologies are now being applied to video to improve resolution. Some TV sets are now equipped with SR functions. However, it is not known if super resolution image reconstruction (SRR) for TV really works or not. Super resolution with non-linear signal processing (SRNL) has recently been proposed. SRR and SRNL are the only methods for processing video signals in real time. The results from subjective assessments of SSR and SRNL are described in this paper. SRR video was produced in simulations with quarter precision motion vectors and 100 iterations. These are ideal conditions for SRR. We found that the image quality of SRNL is better than that of SRR even though SRR was processed under ideal conditions.

Bitrate Reduction Using FMO for Video Streaming over Packet Networks

Flexible macroblock ordering (FMO), adopted in the H.264 standard, allows to partition all macroblocks (MBs) in a frame into separate groups of MBs called Slice Groups (SGs). FMO can not only support error-resilience, but also control the size of video packets for different network types. However, it is well-known that the number of bits required for encoding the frame is increased by adopting FMO. In this paper, we propose a novel algorithm that can reduce the bitrate overhead caused by utilizing FMO. In the proposed algorithm, all MBs are grouped in SGs based on the similarity of the transform coefficients. Experimental results show that our algorithm can reduce the bitrate as compared with conventional FMO.

Parameter Estimation for Viewing Rank Distribution of Video-on-Demand

Video-on-demand (VOD) is designed by using content delivery networks (CDN) to minimize the overall operational cost and to maximize scalability. Estimation of the viewing pattern (i.e., the relationship between the number of viewings and the ranking of VOD contents) plays an important role in minimizing the total operational cost and maximizing the performance of the VOD systems. In this paper, we have analyzed a large body of commercial VOD viewing data and found that the viewing rank distribution fits well with the parabolic fractal distribution. The weighted linear model fitting function is used to estimate the parameters (coefficients) of the parabolic fractal distribution. This paper presents an analytical basis for designing an optimal hierarchical VOD contents distribution system in terms of its cost and performance.

Action Recognition in Video Sequences using a Mealy Machine

In this paper the use of sequential machines for recognizing actions taken by the objects detected by a general tracking algorithm is proposed. The system may deal with the uncertainty inherent in medium-level vision data. For this purpose, fuzzification of input data is performed. Besides, this transformation allows to manage data independently of the tracking application selected and enables adding characteristics of the analyzed scenario. The representation of actions by means of an automaton and the generation of the input symbols for finite automaton depending on the object and action compared are described. The output of the comparison process between an object and an action is a numerical value that represents the membership of the object to the action. This value is computed depending on how similar the object and the action are. The work concludes with the application of the proposed technique to identify the behavior of vehicles in road traffic scenes.

A Robust Method for Hand Tracking Using Mean-shift Algorithm and Kalman Filter in Stereo Color Image Sequences

Real-time hand tracking is a challenging task in many computer vision applications such as gesture recognition. This paper proposes a robust method for hand tracking in a complex environment using Mean-shift analysis and Kalman filter in conjunction with 3D depth map. The depth information solve the overlapping problem between hands and face, which is obtained by passive stereo measuring based on cross correlation and the known calibration data of the cameras. Mean-shift analysis uses the gradient of Bhattacharyya coefficient as a similarity function to derive the candidate of the hand that is most similar to a given hand target model. And then, Kalman filter is used to estimate the position of the hand target. The results of hand tracking, tested on various video sequences, are robust to changes in shape as well as partial occlusion.

Neuro-fuzzy Classification System for Wireless-Capsule Endoscopic Images

In this research study, an intelligent detection system to support medical diagnosis and detection of abnormal lesions by processing endoscopic images is presented. The images used in this study have been obtained using the M2A Swallowable Imaging Capsule - a patented, video color-imaging disposable capsule. Schemes have been developed to extract texture features from the fuzzy texture spectra in the chromatic and achromatic domains for a selected region of interest from each color component histogram of endoscopic images. The implementation of an advanced fuzzy inference neural network which combines fuzzy systems and artificial neural networks and the concept of fusion of multiple classifiers dedicated to specific feature parameters have been also adopted in this paper. The achieved high detection accuracy of the proposed system has provided thus an indication that such intelligent schemes could be used as a supplementary diagnostic tool in endoscopy.

Opportunistic Routing with Secure Coded Wireless Multicast Using MAS Approach

Many Wireless Sensor Network (WSN) applications necessitate secure multicast services for the purpose of broadcasting delay sensitive data like video files and live telecast at fixed time-slot. This work provides a novel method to deal with end-to-end delay and drop rate of packets. Opportunistic Routing chooses a link based on the maximum probability of packet delivery ratio. Null Key Generation helps in authenticating packets to the receiver. Markov Decision Process based Adaptive Scheduling algorithm determines the time slot for packet transmission. Both theoretical analysis and simulation results show that the proposed protocol ensures better performance in terms of packet delivery ratio, average end-to-end delay and normalized routing overhead.

Enhancement of Stereo Video Pairs Using SDNs To Aid In 3D Reconstruction

This paper presents the results of enhancing images from a left and right stereo pair in order to increase the resolution of a 3D representation of a scene generated from that same pair. A new neural network structure known as a Self Delaying Dynamic Network (SDN) has been used to perform the enhancement. The advantage of SDNs over existing techniques such as bicubic interpolation is their ability to cope with motion and noise effects. SDNs are used to generate two high resolution images, one based on frames taken from the left view of the subject, and one based on the frames from the right. This new high resolution stereo pair is then processed by a disparity map generator. The disparity map generated is compared to two other disparity maps generated from the same scene. The first is a map generated from an original high resolution stereo pair and the second is a map generated using a stereo pair which has been enhanced using bicubic interpolation. The maps generated using the SDN enhanced pairs match more closely the target maps. The addition of extra noise into the input images is less problematic for the SDN system which is still able to out perform bicubic interpolation.

Efficient Mean Shift Clustering Using Exponential Integral Kernels

This paper presents a highly efficient algorithm for detecting and tracking humans and objects in video surveillance sequences. Mean shift clustering is applied on backgrounddifferenced image sequences. For efficiency, all calculations are performed on integral images. Novel corresponding exponential integral kernels are introduced to allow the application of nonuniform kernels for clustering, which dramatically increases robustness without giving up the efficiency of the integral data structures. Experimental results demonstrating the power of this approach are presented.

Blinking Characteristics and Corneal Staining in Different Soft Lens Materials

Background Contact lens (CL) wear can cause changes in blinking and corneal staining. Aims and Objectives To determine the effects of CL materials (HEMA and SiHy) on spontaneous blink rate, blinking patterns and corneal staining after 2 months of wear. Methods Ninety subjects in 3 groups (control, HEMA and SiHy) were assessed at baseline and 2-months. Blink rate was recorded using a video camera. Blinking patterns were assessed with digital camera and slit lamp biomicroscope. Corneal staining was graded using IER grading scale Results There were no significant differences in all parameters at baseline. At 2 months, CL wearers showed significant increment in average blink rate (F1.626, 47.141 = 7.250, p = 0.003; F2,58 = 6.240, p = 0.004) and corneal staining (χ2 2, n=30 = 31.921, p < 0.001; χ2 2, n=30 = 26.909, p < 0.001). Conclusion Blinking characteristics and corneal staining were not influence by soft CL materials.

Video Super-Resolution Using Classification ANN

In this study, a classification-based video super-resolution method using artificial neural network (ANN) is proposed to enhance low-resolution (LR) to high-resolution (HR) frames. The proposed method consists of four main steps: classification, motion-trace volume collection, temporal adjustment, and ANN prediction. A classifier is designed based on the edge properties of a pixel in the LR frame to identify the spatial information. To exploit the spatio-temporal information, a motion-trace volume is collected using motion estimation, which can eliminate unfathomable object motion in the LR frames. In addition, temporal lateral process is employed for volume adjustment to reduce unnecessary temporal features. Finally, ANN is applied to each class to learn the complicated spatio-temporal relationship between LR and HR frames. Simulation results show that the proposed method successfully improves both peak signal-to-noise ratio and perceptual quality.

Loop Back Connected Component Labeling Algorithm and Its Implementation in Detecting Face

In this study, a Loop Back Algorithm for component connected labeling for detecting objects in a digital image is presented. The approach is using loop back connected component labeling algorithm that helps the system to distinguish the object detected according to their label. Deferent than whole window scanning technique, this technique reduces the searching time for locating the object by focusing on the suspected object based on certain features defined. In this study, the approach was also implemented for a face detection system. Face detection system is becoming interesting research since there are many devices or systems that require detecting the face for certain purposes. The input can be from still image or videos, therefore the sub process of this system has to be simple, efficient and accurate to give a good result.

Abrupt Scene Change Detection

A number of automated shot-change detection methods for indexing a video sequence to facilitate browsing and retrieval have been proposed in recent years. This paper emphasizes on the simulation of video shot boundary detection using one of the methods of the color histogram wherein scaling of the histogram metrics is an added feature. The difference between the histograms of two consecutive frames is evaluated resulting in the metrics. Further scaling of the metrics is performed to avoid ambiguity and to enable the choice of apt threshold for any type of videos which involves minor error due to flashlight, camera motion, etc. Two sample videos are used here with resolution of 352 X 240 pixels using color histogram approach in the uncompressed media. An attempt is made for the retrieval of color video. The simulation is performed for the abrupt change in video which yields 90% recall and precision value.

Hand Gesture Recognition: Sign to Voice System (S2V)

Hand gesture is one of the typical methods used in sign language for non-verbal communication. It is most commonly used by people who have hearing or speech problems to communicate among themselves or with normal people. Various sign language systems have been developed by manufacturers around the globe but they are neither flexible nor cost-effective for the end users. This paper presents a system prototype that is able to automatically recognize sign language to help normal people to communicate more effectively with the hearing or speech impaired people. The Sign to Voice system prototype, S2V, was developed using Feed Forward Neural Network for two-sequence signs detection. Different sets of universal hand gestures were captured from video camera and utilized to train the neural network for classification purpose. The experimental results have shown that neural network has achieved satisfactory result for sign-to-voice translation.