Abstract: Mobile robots are used in a wide range of scenarios,
such as exploring contaminated areas, repairing oil rigs under water,
and finding survivors in collapsed buildings. Currently, there is no
unified, intuitive user interface (UI) to control such complex mobile
robots. As a consequence, some tasks are carried out without
exploiting the experience and intuition of human teleoperators.
A novel framework has been developed to embed a flexible and
modular UI into a complete 3-D virtual reality simulation system.
This new approach aims to make the most of the capabilities of human
operators. Sensor information received from the robot is prepared for
intuitive visualization. Virtual reality metaphors support the
operator in decision-making. These metaphors are integrated into a
real-time stereo video stream. This approach is not restricted to any
specific type of mobile robot and allows for the operation of different
robot types with a consistent concept and user interface.
Abstract: This paper proposes a novel spectrum sensing technique
for digital video broadcasting-terrestrial (DVB-T) systems that
exploits the periodicity of pilot signals in orthogonal frequency
division multiplexing (OFDM) symbols. The proposed scheme
overcomes the effect of timing synchronization errors by recorrelating
the correlation values at the same sample distances. The
numerical results demonstrate that the detection probability
of the proposed scheme exceeds that of the conventional
scheme in the presence of a timing synchronization error.
Abstract: This paper presents an enhanced frame-based video coding scheme. The input source video to the enhanced frame-based video encoder consists of a rectangular video and the shapes of arbitrarily shaped objects in the video frames. The rectangular frame texture is encoded by the conventional frame-based coding technique, and the video object's shape is encoded using contour-based vertex coding. Several useful content-based functionalities can be achieved by utilizing the shape information in the bitstream at the cost of a very small bitrate overhead.
Abstract: The application of expert systems in agriculture would take the form of Integrated Crop Management decision aids and would encompass water management, fertilizer management, crop protection systems and identification of implements. In order to remain competitive, the modern farmer often relies on agricultural specialists and advisors to provide information for decision-making. An expert system is normally composed of a knowledge base (information, heuristics, etc.), an inference engine (which analyzes the knowledge base), and an end-user interface (accepting inputs, generating outputs). Software named 'CROP-9-DSS', incorporating modern features such as graphics, photos and video clips, has been developed. This package serves as a decision support system for the identification of pests and diseases with control measures, fertilizer recommendation, water management and identification of farm implements for the leading crops of Kerala (India), namely Coconut, Rice, Cashew, Pepper, Banana, and four vegetables: Amaranthus, Bhindi, Brinjal and Cucurbits. 'CROP-9-DSS' will act as an expert system for agricultural officers, agricultural scientists and extension workers, supporting their decision-making and helping them make suitable recommendations.
Abstract: This paper studies the high-level modelling and design
of delta-sigma (ΔΣ) noise shapers for an audio Digital-to-Analog
Converter (DAC) so as to eliminate the in-band Signal-to-Noise
Ratio (SNR) degradation that accompanies one-channel mismatch in
the audio signal. The converter combines cascaded digital signal
interpolation with a noise-shaping single-loop delta-sigma modulator
that has a 5-bit quantizer in the final stage. To reduce sensitivity to
the DAC nonlinearities of the last stage, a
high-pass second-order Data Weighted Averaging (R2DWA) scheme is
introduced. This paper presents a MATLAB modelling
approach for the proposed DAC architecture with low-distortion and
swing-suppression integrator designs. The ΔΣ modulator can
be configured as a 3rd-order design and supports 24-bit PCM at a sampling rate
of 64 kHz for Digital Video Disc (DVD) audio applications. The
modelling approach provides 139.38 dB of dynamic range for a 32
kHz signal band at a -1.6 dBFS input signal level.
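The element rotation that underlies data weighted averaging can be illustrated with a minimal sketch. It shows plain first-order DWA only, not the second-order high-pass R2DWA variant named in the abstract, and it is written in Python rather than MATLAB. The class name `DwaSelector` and the choice of element count are assumptions made for illustration.

```python
class DwaSelector:
    """First-order data weighted averaging for a unit-element DAC (illustrative)."""

    def __init__(self, num_elements):
        self.num_elements = num_elements
        self.pointer = 0                  # index of the next element to use
        self.usage = [0] * num_elements   # how often each element was selected

    def select(self, code):
        """Return the indices of the `code` unit elements to switch on."""
        if not 0 <= code <= self.num_elements:
            raise ValueError("code out of range")
        # Use the next `code` elements in rotation, wrapping around the array.
        chosen = [(self.pointer + i) % self.num_elements for i in range(code)]
        for index in chosen:
            self.usage[index] += 1
        # Advance the pointer so the next code starts where this one ended.
        self.pointer = (self.pointer + code) % self.num_elements
        return chosen
```

Because each selection starts where the previous one ended, all elements are used equally often over time, which is what moves the mismatch error out of the signal band in a first-order scheme.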
Abstract: In this work, we present a new approach to automatic shot transition detection. Our approach is based on the analysis of Spatio-Temporal Video Slice (STVS) edges extracted from videos. The proposed approach can efficiently detect both abrupt shot transitions ('cuts') and gradual ones such as fade-in, fade-out and dissolve. Compared to other techniques, our method is distinguished by its high precision and speed. This performance is obtained by reducing the shot boundary detection problem to a simple 2D image partitioning problem.
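How a spatio-temporal slice turns shot detection into a 2D image problem can be sketched as follows. This is a simplified illustration, not the method of the abstract: it takes one pixel row per frame and flags abrupt changes between consecutive slice rows by a mean absolute difference, whereas the abstract analyzes slice edges and also handles gradual transitions. The function names, the frame representation (lists of rows of grey values) and the threshold are assumptions.

```python
def horizontal_slice(frames, row=None):
    """Stack one pixel row per frame into a 2D (time x width) image."""
    if row is None:
        row = len(frames[0]) // 2  # default: a row near the middle of the frame
    return [list(frame[row]) for frame in frames]


def detect_cuts(slice_image, threshold):
    """Return frame indices where the slice changes abruptly over time."""
    cuts = []
    for t in range(1, len(slice_image)):
        previous, current = slice_image[t - 1], slice_image[t]
        mean_abs_diff = sum(abs(a - b) for a, b in zip(previous, current)) / len(current)
        if mean_abs_diff > threshold:
            cuts.append(t)  # frame t starts a new shot
    return cuts
```

In the slice image a cut appears as a horizontal discontinuity, so locating it is a partitioning problem on a single 2D image rather than a comparison of full frames.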
Abstract: This paper proposes a novel, robust, and simple method for obtaining a human 3D face model and camera pose (position and orientation) from a video sequence. Given a video sequence of a face recorded with an off-the-shelf digital camera, the feature points used to define facial parts are tracked using the Active Appearance Model (AAM). Then, the face's 3D structure and the camera pose of each video frame can be calculated simultaneously from the obtained point correspondences. The proposed method is primarily based on a combination of Gradient Descent and Powell's Multidimensional Minimization. With this method, temporarily occluded points, including cases of self-occlusion, do not pose a problem. As long as the point correspondences in the video sequence have enough parallax, these missing points can still be reconstructed.
Abstract: Monitored 3-Dimensional (3D) video experience can be used as “feedback information” to fine-tune service parameters in order to provide a better service to demanding 3D service customers. The 3D video experience, which includes both video quality and depth perception, is influenced by several contextual and content-related factors (e.g., ambient illumination condition, content characteristics, etc.) due to the complex nature of 3D video. Therefore, the factors that affect this experience should be taken into account when assessing it. In this paper, the structural information of the depth map sequences of the 3D video is considered as a content-related factor that affects depth perception assessment. A cartoon-like filter is used to abstract the significant depth levels in the depth map sequences to determine the structural information. Moreover, subjective experiments are conducted using 3D videos associated with cartoon-like depth map sequences to investigate the effect of the ambient illumination condition, a contextual factor, on depth perception. Using the knowledge gained through this study, 3D video experience metrics can be developed to deliver better service to 3D video service users.
Abstract: In this paper, we propose an effective relay
communication scheme for layered video transmission as a way to
make the most of limited resources in a wireless communication
network where loss often occurs. Relaying brings more stable multimedia
services to end clients than multiple description coding
(MDC). In addition, having the relay device retransmit only the parity
data of one or more video layers, generated by a channel coder, to the
end client is key to robustness against loss. Using these
methods in resource-constrained environments, such as real-time user
created content (UCC) with layered video transmission, can provide
high-quality services even in a poor communication environment.
Minimal services are also possible. The mathematical analysis shows
that the proposed method reduces the GOP loss rate
compared to MDC and raptor code without relay. The GOP loss rate
is about zero, while MDC and raptor code without relay have GOP
loss rates of 36% and 70%, respectively, at a 10% frame loss rate.
Abstract: Naive Bayes Nearest Neighbor (NBNN) and its variants, i.e., local NBNN and the NBNN kernels, are local feature-based classifiers that have achieved impressive performance in image classification. By exploiting instance-to-class (I2C) distances (an instance is an image or video in image or video classification), they avoid the quantization errors of local image descriptors in the bag of words (BoW) model. However, the performance of NBNN, local NBNN and the NBNN kernels has not been validated on video analysis. In this paper, we introduce these three classifiers into human action recognition and conduct comprehensive experiments on the benchmark KTH dataset and the realistic HMDB dataset. The results show that these I2C-based classifiers consistently outperform the SVM classifier with the BoW model.
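The instance-to-class distance used by the basic NBNN classifier can be sketched in a few lines. This is a minimal illustration of plain NBNN only (not local NBNN or the NBNN kernels), using brute-force nearest-neighbor search. The function names and the dictionary layout of `class_descriptors` are assumptions for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nbnn_classify(query_descriptors, class_descriptors):
    """Pick the class with the smallest instance-to-class (I2C) distance.

    class_descriptors maps each label to the pool of local descriptors
    gathered from all training instances of that class.
    """
    best_label, best_total = None, float("inf")
    for label, pool in class_descriptors.items():
        # I2C distance: each query descriptor contributes the distance to
        # its nearest neighbor in the class pool; no quantization is involved.
        total = sum(
            min(squared_distance(d, p) for p in pool) for d in query_descriptors
        )
        if total < best_total:
            best_label, best_total = label, total
    return best_label
```

Because descriptors are compared directly against the class pools, no codebook is built, which is how the approach avoids the quantization step of the BoW model.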
Abstract: A multimedia presentation system refers to the integration of a multimedia database with a presentation manager that provides content selection, organization and playout of multimedia presentations. It requires high performance from the system components involved. From multimedia information capture through to presentation delivery, high-performance tools are required for accessing, manipulating, storing and retrieving these segments, and for transferring and delivering them to a presentation terminal according to a playout order. The organization of presentations is a complex task in that the display order of presentation contents (in time and space) must be specified. A multimedia presentation contains audio, video, image and text media types. The critical decisions in presentation construction include what the contents are and how they are organized. Once the organization of the contents has been decided, the presentation must be conveyed to the end user in the correct organizational order and in a timely fashion. This paper introduces a framework for the specification of multimedia presentations and describes the design of sample presentations from a multimedia database using this framework.
Abstract: Asynchronous Transfer Mode (ATM) is widely used
in telecommunications systems to send data, video and voice at a
very high speed. In ATM networks, optimizing the bandwidth through
dynamic routing is an important consideration. Previous research
shows that traditional optimization heuristics result in suboptimal
solutions. In this paper, we explore non-traditional
optimization techniques. We compare two such
algorithms, Genetic Algorithm (GA) and Tabu Search (TS), for
solving the dynamic
routing problem in ATM networks, which in turn optimizes the
bandwidth. The optimized bandwidth could make some
attractive business applications feasible, such as high-speed
LAN interconnection and teleconferencing. We have also
performed a comparative study of the selection mechanisms in GA
and listed the best selection mechanism and a new initialization
technique which improves the efficiency of the GA.
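To make the notion of a GA selection mechanism concrete, the sketch below shows two widely used ones, roulette-wheel and tournament selection. The abstract does not say which mechanisms were compared or which one performed best, so these are illustrative choices only. The function names and the assumption that higher fitness is better and never negative are ours.

```python
import random


def roulette_wheel_select(population, fitnesses, rng):
    """Pick an individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point rounding


def tournament_select(population, fitnesses, rng, size=2):
    """Pick the fittest of `size` distinct, randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]
```

In a routing setting, each individual would encode a candidate route and the fitness would reflect its bandwidth cost; passing a seeded `random.Random` instance as `rng` keeps runs reproducible.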
Abstract: An effective visual error concealment method is presented that employs a robust rotation, scale, and translation (RST) invariant partial patch matching model (RSTI-PPMM) and exemplar-based inpainting. While the proposed robust and inherently feature-enhanced texture synthesis approach generates perceptually plausible visual error concealment results, the outlier pruning property yields significant quality improvements, both quantitatively and qualitatively. No intermediate user interaction is required for pre-segmented media, and the presented method follows a bootstrapping approach for automatic visual loss recovery and image and video error concealment.
Abstract: An Advanced Driver Assistance System (ADAS) is a computer system on board a vehicle that is used to reduce the risk of vehicular accidents by monitoring factors relating to the driver, vehicle and environment and taking some action when a risk is identified. Much work has been done on assessing vehicle and environmental state, but there is still comparatively little published work that tackles the problem of driver state. Visual attention is one such driver state. In fact, some researchers claim that lack of attention is the main cause of accidents, as factors such as fatigue, alcohol or drug use, distraction and speeding all impair the driver's capacity to pay attention to the vehicle and road conditions [1]. This seems to imply that the main cause of accidents is inappropriate driver behaviour in cases where the driver is not giving full attention while driving. The work presented in this paper proposes an ADAS that uses an image-based template matching algorithm to detect whether a driver is failing to observe particular windscreen cells. This is achieved by dividing the windscreen into 24 uniform cells (4 rows of 6 columns) and matching video images of the driver's left eye with eye-gesture templates drawn from images of the driver looking at the centre of each windscreen cell. The main contribution of this paper is to assess the accuracy of this approach using Receiver Operating Characteristic analysis. The results of our evaluation give a sensitivity of 84.3% and a specificity of 85.0% for the eye-gesture template approach, indicating that it may be useful for determining the driver's point of regard.
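The two accuracy figures quoted above follow from standard definitions, and the 4 x 6 windscreen grid can be indexed in a straightforward way. The sketch below illustrates both. The row-major cell numbering and the function names are assumptions, and the sample counts in the usage note are made up and are not the data of the paper.

```python
def cell_to_row_col(cell, columns=6):
    """Map a cell index (0..23, row-major by assumption) to (row, column)."""
    return divmod(cell, columns)


def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)  # true positive rate
    specificity = true_neg / (true_neg + false_pos)  # true negative rate
    return sensitivity, specificity
```

For example, with hypothetical counts of 8 true positives, 2 false negatives, 9 true negatives and 1 false positive, `sensitivity_specificity(8, 2, 9, 1)` gives a sensitivity of 0.8 and a specificity of 0.9.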
Abstract: Multimedia devices are one of the fastest-growing areas in the embedded community. Multimedia devices incorporate a number of complicated functions for their operation, such as motion estimation. A multitude of different implementations, such as spiral search, have been proposed to reduce motion estimation complexity. We have studied the implementations of spiral search and identified areas for improvement. We propose a modified spiral search algorithm with lower computational complexity than the original spiral search. We have implemented our algorithm on an embedded ARM-based architecture with a custom memory hierarchy. The resulting system reduces energy consumption by up to 64% and increases performance by up to 77%, with a small penalty of 2.3 dB, on average, in video quality compared with the original spiral search algorithm.
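For orientation, a baseline spiral search for block matching can be sketched as below. This is the plain variant with a sum of absolute differences (SAD) cost and an early exit on a perfect match, not the modified algorithm proposed in the abstract, whose details are not given. Frames are assumed to be lists of rows of grey values, and all function names are illustrative.

```python
def spiral_offsets(radius):
    """Yield (dx, dy) offsets in a square spiral, starting at (0, 0)."""
    x = y = 0
    dx, dy = 1, 0
    step_length = 1
    yield (0, 0)
    while True:
        for _ in range(2):  # two consecutive legs share each step length
            for _ in range(step_length):
                x, y = x + dx, y + dy
                if max(abs(x), abs(y)) > radius:
                    return  # the whole search window has been visited
                yield (x, y)
            dx, dy = -dy, dx  # turn by 90 degrees
        step_length += 1


def sad(reference, current, top, left, dx, dy, size):
    """Sum of absolute differences between a block and its displaced match."""
    return sum(
        abs(current[top + r][left + c] - reference[top + dy + r][left + dx + c])
        for r in range(size)
        for c in range(size)
    )


def spiral_search(reference, current, top, left, size, radius):
    """Return ((dx, dy), cost) of the best match found in spiral order."""
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = None, None
    for dx, dy in spiral_offsets(radius):
        inside = (0 <= top + dy <= height - size) and (0 <= left + dx <= width - size)
        if not inside:
            continue  # candidate block would fall outside the reference frame
        cost = sad(reference, current, top, left, dx, dy, size)
        if best_cost is None or cost < best_cost:
            best_vector, best_cost = (dx, dy), cost
        if best_cost == 0:
            break  # perfect match, no need to look further
    return best_vector, best_cost
```

Visiting candidates from the centre outwards means small motion vectors, which are the most common, are tested first, so an early exit can skip much of the search window.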
Abstract: A simple but effective digital watermarking scheme
utilizing a context adaptive variable length coding (CAVLC) method
is presented for wireless communication systems. In the proposed
approach, the watermark bits are embedded in the final non-zero
quantized coefficient of each DCT block, thereby yielding a potential
reduction in the length of the coded block. As a result, the
watermarking scheme not only provides the means to check the
authenticity and integrity of the video stream, but also improves the
compression ratio and therefore reduces both the transmission time
and the storage space requirements of the coded video sequence. The
results confirm that the proposed scheme enables the detection of
malicious tampering attacks and reduces the size of the coded H.264
file. Therefore, the proposed scheme is suitable for video
applications in wireless communication systems such as 3G.
Abstract: This paper proposes two novel schemes for pilot-aided
integer frequency offset (IFO) estimation in orthogonal frequency
division multiplexing (OFDM)-based digital video broadcasting-terrestrial
(DVB-T) systems. The conventional scheme for
estimating the IFO uses only part of the information that combinations
of pilots can provide, which stems from a rigorous assumption
that the channel responses of pilots used for estimating the IFO
change very rapidly. In this paper, we therefore propose novel IFO
estimation schemes that exploit all of the information that combinations of
pilots can provide to improve the performance of IFO estimation.
The simulation results show that the proposed schemes are highly
accurate in terms of the IFO detection probability.
Abstract: Since 2004, we have been developing an in-situ storage image sensor (ISIS) that captures more than 100 consecutive images at a frame rate of 10 Mfps with ultra-high sensitivity, as well as the video camera for use with this ISIS. Currently, basic research is continuing in an attempt to increase the frame rate to 100 Mfps and above. In order to suppress electromagnetic noise at such high frequencies, a digital-noiseless image transfer scheme has been developed that uses solely sinusoidal driving voltages. This paper presents highly efficient yet accurate expressions to estimate the attenuation and phase delay of driving voltages through the RC networks of an ultra-high-speed image sensor. The Elmore metric for a fundamental RC chain is employed as the first-order approximation. By applying dimensional analysis to SPICE data, we found a simple expression that significantly improves the accuracy of the approximation. Similarly, another simple closed-form model to estimate the phase delay through fundamental RC networks is also obtained. The estimation error of both expressions is much lower than in previous work, less than 2% in most cases. The framework of this analysis can be extended to address similar issues in other VLSI structures.
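The Elmore metric mentioned as the first-order approximation can be computed directly for an RC chain. The sketch below covers only that classical metric, not the refined expressions of the abstract, which are not reproduced here. The function name and the convention that `resistances[k]` feeds the node carrying `capacitances[k]` are assumptions.

```python
def elmore_delays(resistances, capacitances):
    """Elmore delay at every node of an RC chain driven from one end.

    resistances[k] is the series resistor leading into node k, and
    capacitances[k] is the capacitor from node k to ground.
    """
    delays = []
    for node in range(len(resistances)):
        delay = 0.0
        shared = 0.0
        for k in range(len(resistances)):
            if k <= node:
                shared += resistances[k]  # resistance shared by both paths
            # Each capacitor is weighted by the resistance its charging path
            # shares with the path from the source to the observed node.
            delay += shared * capacitances[k]
        delays.append(delay)
    return delays
```

For a uniform chain of N identical sections with resistance R and capacitance C each, the delay at the far end evaluates to R·C·N(N+1)/2, which grows quadratically with chain length.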
Abstract: Recent years have witnessed the rapid development of
the Internet and telecommunication techniques. Information security
is becoming more and more important. Applications such as covert
communication, copyright protection, etc., stimulate research into
information hiding techniques. Traditionally, encryption is used to
achieve communication security. However, important information
is not protected once decoded. Steganography is the art and science
of communicating in a way that hides the existence of the communication.
Important information is first hidden in host data, such
as a digital image, video or audio, and then transmitted secretly
to the receiver. In this paper, a data hiding model with high security
features combining both cryptography using finite state sequential
machine and image based steganography technique for communicating
information more securely between two locations is proposed.
The authors incorporate a secret key for authentication
at both ends in order to achieve a high level of security. Before the
embedding operation, the secret information is encrypted with
the help of the finite-state sequential machine and segmented into different
parts. The cover image is also segmented into different objects through
normalized cut. Each part of the encoded secret information is
embedded with the help of a novel image steganographic method
(PMM) on different cuts of the cover image to form different stego
objects. Finally, the stego image is formed by combining the different stego
objects and transmitted to the receiver side. At the receiving end, the
opposite processes are run to get back the original secret
message.
Abstract: In this paper, we propose a method that improves the efficiency of video coding. Our method combines an adaptive GOP (group of pictures) structure with shot cut detection. We have analyzed different approaches to shot cut detection with the aim of choosing the most appropriate one. The next step is to place N frames at the positions of the detected cuts during video encoding. Finally, the efficiency of the proposed method is confirmed by simulations, and the results are compared with fixed GOP structures of sizes 4, 8, 12, 16, 32, 64 and 128 and with a GOP structure as long as the entire video. The proposed method achieved a bit rate gain from 0.37% to 50.59%, while providing a PSNR (Peak Signal-to-Noise Ratio) gain from 1.33% to 0.26%, in comparison to the simulated fixed GOP structures.
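One way to picture an adaptive GOP structure is to start a new GOP at every detected cut. The sketch below does this under a loud assumption: the frame placed at each cut is treated as an intra-coded (I) key frame, which the abstract does not state explicitly, and all other frames are marked as predicted (P) frames for simplicity. The function name and the optional `max_gop` limit are hypothetical.

```python
def assign_frame_types(num_frames, cut_positions, max_gop=None):
    """Label each frame "I" or "P", starting a new GOP at every cut.

    Assumption: a key frame is inserted at each detected cut. If max_gop
    is given, a key frame is also forced once a GOP reaches that length.
    """
    cuts = set(cut_positions)
    frame_types = []
    frames_in_gop = 0
    for index in range(num_frames):
        gop_full = max_gop is not None and frames_in_gop >= max_gop
        if index == 0 or index in cuts or gop_full:
            frame_types.append("I")  # start a new GOP here
            frames_in_gop = 1
        else:
            frame_types.append("P")
            frames_in_gop += 1
    return frame_types
```

With cuts detected at frames 3 and 6 of an 8-frame clip, `assign_frame_types(8, [3, 6])` returns `['I', 'P', 'P', 'I', 'P', 'P', 'I', 'P']`, so each shot is coded as its own GOP.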