Abstract: Counting people from a video stream in a noisy environment is a challenging task. This project aims at developing a counting system for transport vehicles, integrated in a video surveillance product. This article presents a method for the detection and tracking of multiple faces in a video by using a model of first and second order local moments. An iterative process is used to estimate the position and shape of multiple faces in images, and to track them. The trajectories are then processed to count people entering and leaving the vehicle.
Abstract: The application of the synchronous dynamic random
access memory (SDRAM) has gone beyond the scope of personal
computers for quite a long time. It comes in handy whenever a large
amount of low-cost yet high-speed memory is needed. Most of
the newly developed stand-alone embedded devices in the field of
image, video and sound processing make increasing use of it. The
large amount of low-cost memory comes with a trade-off: speed. In
order to exploit the full potential of the memory, an efficient
controller is needed. Efficient here means the maximum number of
random accesses to the memory, for both reading and writing, and a
smaller area after implementation. This paper proposes a
target-device-independent pipelined DDR SDRAM controller and
provides a performance comparison with available solutions.
Abstract: The proposed Multimedia Pronunciation Learning
Management System (MPLMS) in this study is a technology with
profound potential for inducing improvement in pronunciation
learning. The MPLMS optimizes the digitised phonetic symbols with
the integration of text, sound and mouth movement video. The
components are designed and developed in an online management
system which turns the web into a dynamic user-centric collection of
consistent and timely information for quality sustainable learning.
The aim of this study is to design and develop the MPLMS which
serves as an innovative tool to improve English pronunciation. This
paper discusses the iterative methodology and the three-phase Alessi
and Trollip model in the development of MPLMS. To align with the
flexibility of educational software development, an iterative
approach comprising planning, design, development, evaluation and
implementation is followed. To ensure the instructional appropriateness of MPLMS, the
instructional system design (ISD) model of Alessi and Trollip serves
as a platform to guide the important instructional factors and process.
It is expected that the results of future empirical research will support
the efficacy of MPLMS and its place as the premier pronunciation
learning system.
Abstract: This study examined the effects of two dynamic
visualizations on 60 Malaysian primary school students' performance
(time on task), retention and transference. The independent variables
in this study were the two dynamic visualizations, the video and the
animated instructions. The dependent variables were the gain scores of
performance, retention and transference. The results showed that the
students in the animation group significantly outperformed the
students in the video group in retention. There were no significant
differences in gain scores for performance and
transference between the animation and the video groups, although the
scores were slightly higher in the animation group than in the
video group. The study concludes that the animation
visualization is superior to the video for retention of a
procedural task.
Abstract: This paper proposes a new design of spatial FIR
filter to automatically detect water level from a video signal of
various river surroundings. The new approach applies
"addition" of frames and a "horizontal" edge detector to distinguish
the water region from the land region. The variance of each line of a filtered
video frame is used as a feature value. The water level is recognized
as a boundary line between the land region and the water region.
An edge detection filter essentially demarcates two distinctly
different regions. However, conventional filters do not
adapt automatically to detect the water level under the various lighting
conditions of river scenery. An optimized filter is proposed so that
the system becomes robust to changes in lighting conditions. The improved
reliability of the proposed system with the optimized filter is
confirmed by the accuracy of water level detection.
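The pipeline described in this abstract (frame addition, a horizontal edge filter, per-line variance as the feature, and a boundary between regions) can be illustrated with a minimal Python sketch. It is not the authors' optimized filter: the vertical first difference used as the edge detector, the rule that places the boundary at the largest jump in variance, and all function names are assumptions made for illustration.

```python
from statistics import pvariance


def add_frames(frames):
    """Pixel-wise sum of equally sized grayscale frames (lists of rows)."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]


def horizontal_edge_filter(img):
    """Vertical first difference, which responds to horizontal edges.

    Assumption: a simple stand-in for the paper's optimized FIR filter.
    """
    return [[img[r + 1][c] - img[r][c] for c in range(len(img[0]))]
            for r in range(len(img) - 1)]


def row_variances(img):
    """Feature value per line: the variance of each filtered row."""
    return [pvariance(row) for row in img]


def water_level_row(frames):
    """Index of the filtered row just after the largest variance jump.

    Assumption: the land/water boundary is where the feature changes most.
    """
    feats = row_variances(horizontal_edge_filter(add_frames(frames)))
    jumps = [abs(feats[r + 1] - feats[r]) for r in range(len(feats) - 1)]
    return jumps.index(max(jumps)) + 1
```

For a frame whose upper rows are textured (land) and whose lower rows are uniform (water), the returned index falls at the transition between the two regions.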
Abstract: In this paper, the application of artificial intelligence to
baby and child care is studied. A new idea for injury
prevention and safety announcement using digital
image processing is then presented. The paper presents the structure of the proposed
system. The system determines the possibility of danger to
children and babies in yards, gardens, swimming pools, etc. In
the presented idea, a multi-camera system is used and the received videos
are processed to find the hazardous areas; the entrance of
children and babies into the determined hazardous areas is then analyzed.
When this occurs, the system performs the programmed action: capturing,
producing an alarm or tone, or sending a message.
Abstract: Super resolution (SR) technologies are now being
applied to video to improve resolution. Some TV sets are now
equipped with SR functions. However, it is not known if super
resolution image reconstruction (SRR) for TV really works or not.
Super resolution with non-linear signal processing (SRNL) has
recently been proposed. SRR and SRNL are the only methods for
processing video signals in real time. The results from subjective
assessments of SRR and SRNL are described in this paper. SRR video
was produced in simulations with quarter precision motion vectors and
100 iterations. These are ideal conditions for SRR. We found that the
image quality of SRNL is better than that of SRR even though SRR
was processed under ideal conditions.
Abstract: Flexible macroblock ordering (FMO), adopted in the
H.264 standard, allows all macroblocks (MBs) in a frame to be partitioned
into separate groups of MBs called Slice Groups (SGs). FMO can not
only support error-resilience, but also control the size of video packets
for different network types. However, it is well-known that the number
of bits required for encoding the frame is increased by adopting FMO.
In this paper, we propose a novel algorithm that can reduce the bitrate
overhead caused by utilizing FMO. In the proposed algorithm, all MBs
are grouped in SGs based on the similarity of the transform
coefficients. Experimental results show that our algorithm can reduce
the bitrate as compared with conventional FMO.
Abstract: Video-on-demand (VOD) is designed by using content delivery networks (CDN) to minimize the overall operational cost and to maximize scalability. Estimation of the viewing pattern (i.e., the relationship between the number of viewings and the ranking of VOD contents) plays an important role in minimizing the total operational cost and maximizing the performance of the VOD systems. In this paper, we have analyzed a large body of commercial VOD viewing data and found that the viewing rank distribution fits well with the parabolic fractal distribution. The weighted linear model fitting function is used to estimate the parameters (coefficients) of the parabolic fractal distribution. This paper presents an analytical basis for designing an optimal hierarchical VOD contents distribution system in terms of its cost and performance.
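The parabolic fractal distribution is linear in its coefficients once both axes are taken to logarithms: log(views) = a + b·log(rank) + c·log(rank)². The following Python sketch fits a, b and c by weighted least squares. It only illustrates the general technique; the choice of weights, the function names and the 3x3 solver are assumptions, and the abstract does not state which weighting the authors used.

```python
import math


def parabolic_fractal(rank, a, b, c):
    """Predicted number of viewings at a given rank."""
    x = math.log(rank)
    return math.exp(a + b * x + c * x * x)


def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]


def fit_parabolic_fractal(ranks, views, weights=None):
    """Weighted least-squares estimate of (a, b, c) in log-log space.

    Assumption: `weights` is a hypothetical per-rank weight list;
    uniform weights are used when it is omitted.
    """
    if weights is None:
        weights = [1.0] * len(ranks)
    A = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for rank, view, w in zip(ranks, views, weights):
        x, y = math.log(rank), math.log(view)
        basis = [1.0, x, x * x]
        for i in range(3):
            rhs[i] += w * basis[i] * y
            for j in range(3):
                A[i][j] += w * basis[i] * basis[j]
    return solve3(A, rhs)
```

Fitting in log space keeps the model linear, so no iterative optimizer is needed. The weights let the fit emphasize, for example, the top-ranked titles.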
Abstract: This paper proposes the use of sequential machines for recognizing actions taken by the objects detected by a general tracking algorithm. The system can deal with the uncertainty inherent in medium-level vision data. For this purpose, fuzzification of the input data is performed. In addition, this transformation allows data to be managed independently of the tracking application selected and enables characteristics of the analyzed scenario to be added. The representation of actions by means of an automaton and the generation of the input symbols for the finite automaton, depending on the object and action compared, are described. The output of the comparison process between an object and an action is a numerical value that represents the membership of the object to the action. This value is computed depending on how similar the object and the action are. The work concludes with the application of the proposed technique to identify the behavior of vehicles in road traffic scenes.
Abstract: Real-time hand tracking is a challenging task in many
computer vision applications such as gesture recognition. This paper
proposes a robust method for hand tracking in a complex environment
using Mean-shift analysis and a Kalman filter in conjunction with a 3D
depth map. The depth information, which is obtained by passive stereo
measurement based on cross-correlation and the known calibration data of
the cameras, resolves the overlap problem between hands and face.
Mean-shift analysis uses the gradient of the Bhattacharyya
coefficient as a similarity function to derive the hand candidate
that is most similar to a given hand target model. A Kalman
filter is then used to estimate the position of the hand target. The results
of hand tracking, tested on various video sequences, are robust to
changes in shape as well as partial occlusion.
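The similarity measure named in this abstract, the Bhattacharyya coefficient, can be sketched in a few lines of Python. The sketch only evaluates the coefficient between normalized histograms and picks the best of a given list of candidates. It does not implement the gradient-based mean-shift iteration or the Kalman filter, and the function names are assumptions.

```python
import math


def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = sum(hist)
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Similarity of two normalized histograms: 1 if identical, 0 if disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def best_candidate(target_hist, candidate_hists):
    """Index of the candidate most similar to the target model.

    Assumption: candidates are supplied explicitly; a real mean-shift
    tracker would move one window along the gradient instead.
    """
    target = normalize(target_hist)
    scores = [bhattacharyya_coefficient(target, normalize(c))
              for c in candidate_hists]
    return scores.index(max(scores))
```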
Abstract: In this research study, an intelligent detection system
to support medical diagnosis and detection of abnormal lesions by
processing endoscopic images is presented. The images used in this
study have been obtained using the M2A Swallowable Imaging
Capsule - a patented, video color-imaging disposable capsule.
Schemes have been developed to extract texture features from the
fuzzy texture spectra in the chromatic and achromatic domains for a
selected region of interest from each color component histogram of
endoscopic images. The implementation of an advanced fuzzy
inference neural network which combines fuzzy systems and
artificial neural networks and the concept of fusion of multiple
classifiers dedicated to specific feature parameters have also been
adopted in this paper. The high detection accuracy achieved by the
proposed system indicates that such intelligent
schemes could be used as a supplementary diagnostic tool in
endoscopy.
Abstract: Many Wireless Sensor Network (WSN) applications require secure multicast services for broadcasting delay-sensitive data such as video files and live telecasts in fixed time slots. This work provides a novel method to deal with the end-to-end delay and drop rate of packets. Opportunistic Routing chooses a link based on the maximum probability of packet delivery ratio. Null Key Generation helps in authenticating packets to the receiver. A Markov Decision Process based Adaptive Scheduling algorithm determines the time slot for packet transmission. Both theoretical analysis and simulation results show that the proposed protocol ensures better performance in terms of packet delivery ratio, average end-to-end delay and normalized routing overhead.
Abstract: This paper presents the results of enhancing images from a left and right stereo pair in order to increase the resolution of a 3D representation of a scene generated from that same pair. A new neural network structure known as a Self Delaying Dynamic Network (SDN) has been used to perform the enhancement. The advantage of SDNs over existing techniques such as bicubic interpolation is their ability to cope with motion and noise effects. SDNs are used to generate two high resolution images, one based on frames taken from the left view of the subject, and one based on the frames from the right. This new high resolution stereo pair is then processed by a disparity map generator. The disparity map generated is compared to two other disparity maps generated from the same scene. The first is a map generated from an original high resolution stereo pair and the second is a map generated using a stereo pair which has been enhanced using bicubic interpolation. The maps generated using the SDN enhanced pairs match the target maps more closely. The addition of extra noise to the input images is less problematic for the SDN system, which is still able to outperform bicubic interpolation.
Abstract: This paper presents a highly efficient algorithm for detecting and tracking humans and objects in video surveillance sequences. Mean shift clustering is applied on background-differenced image sequences. For efficiency, all calculations are performed on integral images. Novel corresponding exponential integral kernels are introduced to allow the application of nonuniform kernels for clustering, which dramatically increases robustness without giving up the efficiency of the integral data structures. Experimental results demonstrating the power of this approach are presented.
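The efficiency argument rests on integral images, which return the sum over any rectangular window with four lookups. The Python sketch below shows only that standard data structure. The exponential integral kernels introduced in the paper are not reproduced, and the function names are assumptions.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros.

    ii[r][c] holds the sum of img over rows 0..r-1 and columns 0..c-1.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 in O(1)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

Once the table is built in a single pass, the cost of a uniform-kernel window sum no longer depends on the window size, which is what makes repeated mean shift iterations cheap.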
Abstract: Background: Contact lens (CL) wear can cause
changes in blinking and corneal staining. Aims and Objectives: To
determine the effects of CL materials (HEMA and SiHy) on
spontaneous blink rate, blinking patterns and corneal staining after 2
months of wear. Methods: Ninety subjects in 3 groups (control,
HEMA and SiHy) were assessed at baseline and at 2 months. Blink rate
was recorded using a video camera. Blinking patterns were assessed
with a digital camera and slit lamp biomicroscope. Corneal staining
was graded using the IER grading scale. Results: There were no significant
differences in any parameter at baseline. At 2 months, CL wearers
showed a significant increase in average blink rate (F(1.626, 47.141) =
7.250, p = 0.003; F(2, 58) = 6.240, p = 0.004) and corneal staining
(χ²(2, n = 30) = 31.921, p < 0.001; χ²(2, n = 30) = 26.909, p < 0.001).
Conclusion: Blinking characteristics and corneal staining were not influenced by
soft CL materials.
Abstract: In this study, a classification-based video
super-resolution method using artificial neural network (ANN) is
proposed to enhance low-resolution (LR) to high-resolution (HR)
frames. The proposed method consists of four main steps:
classification, motion-trace volume collection, temporal adjustment,
and ANN prediction. A classifier is designed based on the edge
properties of a pixel in the LR frame to identify the spatial information.
To exploit the spatio-temporal information, a motion-trace volume is
collected using motion estimation, which can eliminate unfathomable
object motion in the LR frames. In addition, temporal lateral process is
employed for volume adjustment to reduce unnecessary temporal
features. Finally, ANN is applied to each class to learn the complicated
spatio-temporal relationship between LR and HR frames. Simulation
results show that the proposed method successfully improves both
peak signal-to-noise ratio and perceptual quality.
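Peak signal-to-noise ratio, the objective metric reported in this abstract, is computed from the mean squared error between the ground-truth HR frame and the reconstructed frame. The Python sketch below shows the standard definition; the 8-bit peak value of 255 and the function names are assumptions.

```python
import math


def mean_squared_error(reference, estimate):
    """Average squared pixel difference between two equally sized frames."""
    total, count = 0.0, 0
    for ref_row, est_row in zip(reference, estimate):
        for a, b in zip(ref_row, est_row):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB: 10 * log10(peak^2 / MSE); infinite for identical frames."""
    mse = mean_squared_error(reference, estimate)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher PSNR means the reconstruction is closer to the reference, although it does not always agree with perceptual quality, which is why the abstract reports both.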
Abstract: In this study, a Loop Back Algorithm for connected
component labeling for detecting objects in a digital image is
presented. The approach uses a loop back connected component
labeling algorithm that helps the system to distinguish the detected
objects according to their labels. Unlike the whole-window
scanning technique, this technique reduces the search time for
locating the object by focusing on the suspected object based on
certain defined features. In this study, the approach was also
implemented for a face detection system. Face detection is
becoming an interesting research topic since there are many devices or
systems that require detecting the face for certain purposes. The input
can be from still images or videos; therefore the sub-process of this
system has to be simple, efficient and accurate to give a good result.
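The abstract does not give the details of the Loop Back Algorithm, so the Python sketch below shows a conventional connected component labeling by breadth-first flood fill instead. It illustrates what labeling produces, namely one integer label per connected object. The 4-connectivity and the function name are assumptions.

```python
from collections import deque


def label_components(binary):
    """Label 4-connected foreground regions of a binary image.

    Returns (labels, count); background pixels keep the label 0.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```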
Abstract: A number of automated shot-change detection
methods for indexing a video sequence to facilitate browsing and
retrieval have been proposed in recent years. This paper focuses
on the simulation of video shot boundary detection using one of the
methods of the color histogram wherein scaling of the histogram
metrics is an added feature. The difference between the histograms of
two consecutive frames is evaluated resulting in the metrics. Further
scaling of the metrics is performed to avoid ambiguity and to enable
the choice of an apt threshold for any type of video that involves
minor errors due to flashlight, camera motion, etc. Two sample videos
with a resolution of 352 × 240 pixels are used here with the color
histogram approach on uncompressed media. An attempt is made
at the retrieval of color video. The simulation is performed for
abrupt changes in video and yields 90% recall and precision values.
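The histogram-difference idea in this abstract can be sketched in Python as follows. The abstract does not specify how the metric is scaled, so the sketch divides by the largest possible difference to map it into [0, 1]. That scaling, the 4 bins per channel, the 0.5 threshold and the function names are all assumptions.

```python
def color_histogram(frame, bins=4):
    """Concatenated per-channel histogram of (r, g, b) pixels in 0..255."""
    hist = [0] * (3 * bins)
    for pixel in frame:
        for channel, value in enumerate(pixel):
            hist[channel * bins + value * bins // 256] += 1
    return hist


def scaled_differences(frames, bins=4):
    """Histogram difference of consecutive frames, scaled into [0, 1].

    Assumption: every frame has the same number of pixels, so the largest
    possible bin-wise difference is 2 * 3 * pixel_count.
    """
    hists = [color_histogram(f, bins) for f in frames]
    max_diff = 6 * len(frames[0])
    return [sum(abs(a - b) for a, b in zip(h1, h2)) / max_diff
            for h1, h2 in zip(hists, hists[1:])]


def detect_cuts(frames, threshold=0.5, bins=4):
    """Indices of frames that start a new shot (abrupt changes only)."""
    diffs = scaled_differences(frames, bins)
    return [i + 1 for i, d in enumerate(diffs) if d >= threshold]
```

Because the scaled metric does not depend on frame size, one threshold can be reused across videos of different resolutions.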
Abstract: Hand gesture is one of the typical methods used in
sign language for non-verbal communication. It is most commonly
used by people who have hearing or speech problems to
communicate among themselves or with normal people. Various sign
language systems have been developed by manufacturers around the
globe but they are neither flexible nor cost-effective for the end
users. This paper presents a system prototype that is able to
automatically recognize sign language to help normal people to
communicate more effectively with the hearing or speech impaired
people. The Sign to Voice system prototype, S2V, was developed
using a Feed Forward Neural Network for two-sequence sign
detection. Different sets of universal hand gestures were captured
from a video camera and used to train the neural network for
classification purposes. The experimental results show that
the neural network achieved satisfactory results for sign-to-voice
translation.