A Universal Approach for the Intuitive Control of Mobile Robots using an AR/VR-based Interface

Mobile robots are used in a wide range of scenarios, such as exploring contaminated areas, repairing oil rigs under water, and finding survivors in collapsed buildings. Currently, there is no unified, intuitive user interface (UI) for controlling such complex mobile robots. As a consequence, some tasks are carried out without exploiting the experience and intuition of human teleoperators. A novel framework has been developed to embed a flexible and modular UI into a complete 3-D virtual reality simulation system. This approach aims to make the most of the human operator's abilities. Sensor information received from the robot is prepared for intuitive visualization. Virtual reality metaphors support the operator in making decisions. These metaphors are integrated into a real-time stereo video stream. The approach is not restricted to any specific type of mobile robot and allows different robot types to be operated with a consistent concept and user interface.

A Novel Spectrum Sensing Scheme Based on Periodicity of DVB-T Pilot Signals

This paper proposes a novel spectrum sensing technique for digital video broadcasting-terrestrial (DVB-T) systems, which utilizes the periodicity of pilot signals in the orthogonal frequency division multiplexing (OFDM) symbols. The proposed scheme can overcome the effect of the timing synchronization error by recorrelating the correlation values at the same sample distances. The numerical results demonstrate that the proposed scheme achieves a higher detection probability than the conventional scheme when a timing synchronization error exists.

Enhanced Frame-based Video Coding to Support Content-based Functionalities

This paper presents an enhanced frame-based video coding scheme. The input source video to the enhanced frame-based video encoder consists of a rectangular video and the shapes of arbitrarily shaped objects on the video frames. The rectangular frame texture is encoded by the conventional frame-based coding technique, and the video object's shape is encoded using contour-based vertex coding. It is possible to achieve several useful content-based functionalities by utilizing the shape information in the bitstream at the cost of a very small overhead to the bitrate.

Decision Support System “Crop-9-DSS” for Identified Crops

The application of expert systems in agriculture would take the form of Integrated Crop Management decision aids and would encompass water management, fertilizer management, crop protection systems and identification of implements. In order to remain competitive, the modern farmer often relies on agricultural specialists and advisors to provide information for decision-making. An expert system is normally composed of a knowledge base (information, heuristics, etc.), an inference engine (which analyzes the knowledge base), and an end-user interface (accepting inputs, generating outputs). Software named 'CROP-9-DSS', incorporating modern features such as graphics, photos and video clips, has been developed. This package serves as a decision support system for identification of pests and diseases with control measures, fertilizer recommendation, water management and identification of farm implements for the leading crops of Kerala (India), namely Coconut, Rice, Cashew, Pepper, Banana, and four vegetables: Amaranthus, Bhindi, Brinjal and Cucurbits. 'CROP-9-DSS' will act as an expert system for agricultural officers, agricultural scientists and extension workers, supporting their decision-making and helping them suggest suitable recommendations.

A 24-Bit, 8.1-MS/s D/A Converter for Audio Baseband Channel Applications

This paper studies the high-level modelling and design of delta-sigma (ΔΣ) noise shapers for an audio Digital-to-Analog Converter (DAC), so as to eliminate the in-band Signal-to-Noise Ratio (SNR) degradation that accompanies one channel mismatch in the audio signal. The converter combines cascaded digital signal interpolation and a noise-shaping single-loop delta-sigma modulator with a 5-bit quantizer resolution in the final stage. To reduce sensitivity to DAC nonlinearities in the last stage, a high-pass second-order Data Weighted Averaging (R2DWA) scheme is introduced. This paper presents a MATLAB modelling approach for the proposed DAC architecture with low-distortion and swing-suppression integrator designs. The ΔΣ modulator can be configured as 3rd-order and allows 24-bit PCM at a sampling rate of 64 kHz for Digital Video Disc (DVD) audio applications. The modelling approach provides 139.38 dB of dynamic range for a 32 kHz signal band at a -1.6 dBFS input signal level.
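
To illustrate the general idea behind data weighted averaging, the following Python sketch rotates through the unit elements of a multi-bit DAC so that each element is used equally often over time. It shows plain first-order DWA only, not the high-pass second-order R2DWA variant named in the abstract; the class name `DataWeightedAveraging`, the method `select`, and the choice of 8 unit elements are illustrative assumptions.

```python
from typing import List


class DataWeightedAveraging:
    """First-order DWA element selector (illustrative, not the paper's R2DWA)."""

    def __init__(self, num_elements: int) -> None:
        self.num_elements = num_elements
        self.pointer = 0  # index of the next unit element to use

    def select(self, code: int) -> List[int]:
        """Return the indices of the `code` unit elements switched on for this sample."""
        if not 0 <= code <= self.num_elements:
            raise ValueError("code must lie between 0 and num_elements")
        # Take `code` consecutive elements, wrapping around the array.
        chosen = [(self.pointer + i) % self.num_elements for i in range(code)]
        # Advance the pointer so the next sample starts where this one stopped.
        self.pointer = (self.pointer + code) % self.num_elements
        return chosen


# Hypothetical 8-element DAC driven by three quantizer codes.
dwa = DataWeightedAveraging(num_elements=8)
selections = [dwa.select(code) for code in (3, 4, 3)]
print(selections)
```

Because the starting point keeps rotating, element mismatch errors are spread across all elements instead of being tied to particular input codes.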

Spatio-Temporal Video Slice Edges Analysis for Shot Transition Detection and Classification

In this work we present a new approach for automatic shot transition detection. Our approach is based on the analysis of Spatio-Temporal Video Slice (STVS) edges extracted from videos. The proposed approach is capable of efficiently detecting both abrupt shot transitions ('cuts') and gradual ones such as fade-in, fade-out and dissolve. Compared to other techniques, our method is distinguished by its high level of precision and speed. This performance is obtained by reducing the shot boundary detection problem to a simple 2D image partitioning problem.
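
As a rough illustration of how a spatio-temporal slice turns a video problem into a 2D image problem, the Python sketch below stacks the same pixel row from every frame into a (time, width) image and flags rows where the slice changes sharply. It is a minimal sketch under stated assumptions, not the authors' method: the frame representation (lists of grayscale rows), the function names `horizontal_slice` and `cut_candidates`, and the mean-absolute-difference threshold are all hypothetical.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities


def horizontal_slice(frames: List[Frame], row: int) -> List[List[int]]:
    """Stack the same pixel row from every frame; result is indexed (time, x)."""
    return [list(frame[row]) for frame in frames]


def cut_candidates(slice_2d: List[List[int]], threshold: float) -> List[int]:
    """Return time indices t where slice row t differs strongly from row t-1."""
    cuts = []
    for t in range(1, len(slice_2d)):
        prev, cur = slice_2d[t - 1], slice_2d[t]
        mean_abs_diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if mean_abs_diff > threshold:
            cuts.append(t)
    return cuts


# Synthetic clip (hypothetical data): three dark frames, then three bright ones.
dark = [[10, 10, 10, 10] for _ in range(3)]
bright = [[200, 200, 200, 200] for _ in range(3)]
frames = [dark, dark, dark, bright, bright, bright]

stvs = horizontal_slice(frames, row=1)
print(cut_candidates(stvs, threshold=50.0))
```

In the slice image an abrupt cut shows up as a horizontal edge, whereas a gradual transition would appear as a smooth ramp that this simple threshold does not handle.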

Face Reconstruction and Camera Pose Using Multi-dimensional Descent

This paper proposes a novel, robust, and simple method for obtaining a human 3D face model and camera pose (position and orientation) from a video sequence. Given a video sequence of a face recorded with an off-the-shelf digital camera, feature points used to define facial parts are tracked using the Active Appearance Model (AAM). Then, the face's 3D structure and the camera pose of each video frame can be simultaneously calculated from the obtained point correspondences. The proposed method is primarily based on the combined approaches of Gradient Descent and Powell's Multidimensional Minimization. With the proposed method, temporarily occluded points, including cases of self-occlusion, do not pose a problem. As long as the point correspondences displayed in the video sequence have enough parallax, these missing points can still be reconstructed.

Cartoon Effect and Ambient Illumination Based Depth Perception Assessment of 3D Video

Monitored 3-Dimensional (3D) video experience can be utilized as “feedback information” to fine-tune service parameters and provide a better service to demanding 3D service customers. The 3D video experience, which includes both video quality and depth perception, is influenced by several contextual and content-related factors (e.g., ambient illumination condition, content characteristics, etc.) due to the complex nature of 3D video. Therefore, the factors that affect this experience should be taken into account when assessing it. In this paper, the structural information of the depth map sequences of the 3D video is considered as a content-related factor that affects depth perception assessment. A cartoon-like filter is utilized to abstract the significant depth levels in the depth map sequences to determine the structural information. Moreover, subjective experiments are conducted using 3D videos associated with cartoon-like depth map sequences to investigate the effect of the ambient illumination condition, which is a contextual factor, on depth perception. Using the knowledge gained through this study, 3D video experience metrics can be developed to deliver better service to 3D video service users.

Effective Relay Communication for Scalable Video Transmission

In this paper, we propose an effective relay communication scheme for layered video transmission as an alternative way to make the most of limited resources in a wireless communication network where loss often occurs. Relaying brings more stable multimedia services to end clients than multiple description coding (MDC). In addition, retransmitting only parity data for one or more video layers, generated by a channel coder, from the relay device to the end client is key to robustness against loss. Using these methods in resource-constrained environments, such as real-time user created content (UCC) with layered video transmission, can provide high-quality services even in a poor communication environment. Minimal services are also possible. The mathematical analysis shows that the proposed method reduces the GOP loss rate compared to MDC and raptor code without relay. At a 10% frame loss rate, the GOP loss rate of the proposed method is about zero, while MDC and raptor code without relay have GOP loss rates of 36% and 70%, respectively.

Evaluation of Classifiers Based On I2C Distance for Action Recognition

Naive Bayes Nearest Neighbor (NBNN) and its variants, i.e., local NBNN and the NBNN kernels, are local feature-based classifiers that have achieved impressive performance in image classification. By exploiting instance-to-class (I2C) distances (an instance is an image or video in image or video classification), they avoid the quantization errors of local image descriptors in the bag of words (BoW) model. However, the performance of NBNN, local NBNN and the NBNN kernels has not been validated on video analysis. In this paper, we introduce these three classifiers into human action recognition and conduct comprehensive experiments on the benchmark KTH and the realistic HMDB datasets. The results show that these I2C-based classifiers consistently outperform the SVM classifier with the BoW model.
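
The instance-to-class distance at the heart of basic NBNN can be sketched in a few lines of Python. For each local descriptor of the query, the distance to its nearest neighbor among all descriptors pooled from one class is computed, and these distances are summed; the class with the smallest sum wins. This is a minimal sketch of plain NBNN only, not local NBNN or the NBNN kernels, and the function names, the 2-D toy descriptors, and the class labels are illustrative assumptions.

```python
from typing import Dict, List, Sequence

Descriptor = Sequence[float]


def squared_distance(a: Descriptor, b: Descriptor) -> float:
    """Squared Euclidean distance between two local descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def i2c_distance(query: List[Descriptor], class_pool: List[Descriptor]) -> float:
    """Sum over query descriptors of the squared distance to the nearest
    descriptor in the class pool (no quantization into visual words)."""
    return sum(min(squared_distance(d, c) for c in class_pool) for d in query)


def nbnn_classify(query: List[Descriptor], classes: Dict[str, List[Descriptor]]) -> str:
    """Pick the class whose pooled descriptors give the smallest I2C distance."""
    return min(classes, key=lambda label: i2c_distance(query, classes[label]))


# Toy 2-D descriptors (hypothetical data standing in for real video features).
classes = {
    "walking": [(0.0, 0.0), (1.0, 0.0)],
    "running": [(5.0, 5.0), (6.0, 5.0)],
}
query = [(0.2, 0.1), (0.9, -0.1)]

print(nbnn_classify(query, classes))
```

A brute-force nearest-neighbor search is used here for clarity; practical implementations usually rely on an approximate nearest-neighbor index.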

Specification of Attributes of a Multimedia Presentation for Presentation Manager

A multimedia presentation system refers to the integration of a multimedia database with a presentation manager which has the functionality of content selection, organization and playout of multimedia presentations. It requires high performance from the system components involved. From multimedia information capture to presentation delivery, high-performance tools are required for accessing, manipulating, storing and retrieving these segments, and for transferring and delivering them to a presentation terminal according to a playout order. The organization of presentations is a complex task in that the display order of presentation contents (in time and space) must be specified. A multimedia presentation contains audio, video, image and text media types. The critical decisions for presentation construction include what the contents are and how they are organized. Once the organization has been decided, the presentation must be conveyed to the end user in the correct organizational order and in a timely fashion. This paper introduces a framework for the specification of multimedia presentations and describes the design of sample presentations from a multimedia database using this framework.

Bandwidth Optimization through Dynamic Routing in ATM Networks: Genetic Algorithm and Tabu Search Approach

Asynchronous Transfer Mode (ATM) is widely used in telecommunications systems to send data, video and voice at very high speed. In ATM networks, optimizing the bandwidth through dynamic routing is an important consideration. Previous research shows that traditional optimization heuristics result in suboptimal solutions. In this paper we explore non-traditional optimization techniques. We compare two such algorithms, Genetic Algorithm (GA) and Tabu Search (TS), for solving the dynamic routing problem in ATM networks, which in turn optimizes the bandwidth. The optimized bandwidth could make some attractive business applications feasible, such as high-speed LAN interconnection and teleconferencing. We have also performed a comparative study of the selection mechanisms in GA, and we report the best selection mechanism and a new initialization technique that improves the efficiency of the GA.

Effective Image and Video Error Concealment using RST-Invariant Partial Patch Matching Model and Exemplar-based Inpainting

An effective visual error concealment method is presented that employs a robust rotation, scale, and translation (RST) invariant partial patch matching model (RSTI-PPMM) and exemplar-based inpainting. While the proposed robust and inherently feature-enhanced texture synthesis approach ensures excellent and perceptually plausible visual error concealment results, the outlier pruning property yields significant quality improvements, both quantitatively and qualitatively. No intermediate user interaction is required for pre-segmented media, and the presented method follows a bootstrapping approach for automatic visual loss recovery and image and video error concealment.

Analysis of Driver Point of Regard Determinations with Eye-Gesture Templates Using Receiver Operating Characteristic

An Advanced Driver Assistance System (ADAS) is a computer system on board a vehicle that is used to reduce the risk of vehicular accidents by monitoring factors relating to the driver, vehicle and environment and taking some action when a risk is identified. Much work has been done on assessing vehicle and environmental state, but there is still comparatively little published work that tackles the problem of driver state. Visual attention is one such driver state. In fact, some researchers claim that lack of attention is the main cause of accidents, as factors such as fatigue, alcohol or drug use, distraction and speeding all impair the driver's capacity to pay attention to the vehicle and road conditions [1]. This seems to imply that the main cause of accidents is inappropriate driver behaviour in cases where the driver is not giving full attention while driving. The work presented in this paper proposes an ADAS which uses an image-based template matching algorithm to detect if a driver is failing to observe particular windscreen cells. This is achieved by dividing the windscreen into 24 uniform cells (4 rows of 6 columns) and matching video images of the driver's left eye with eye-gesture templates drawn from images of the driver looking at the centre of each windscreen cell. The main contribution of this paper is to assess the accuracy of this approach using Receiver Operating Characteristic analysis. The results of our evaluation give a sensitivity of 84.3% and a specificity of 85.0% for the eye-gesture template approach, indicating that it may be useful for driver point of regard determinations.
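
For readers unfamiliar with the two reported metrics, the short Python sketch below shows how sensitivity and specificity follow from confusion-matrix counts, together with one possible way to index the 24 windscreen cells. The counts are hypothetical and are not the paper's data; the row-major cell numbering and the function names are likewise assumptions made for illustration.

```python
ROWS, COLS = 4, 6  # windscreen grid from the abstract: 24 uniform cells


def cell_index(row: int, col: int) -> int:
    """Row-major index of a windscreen cell (assumed numbering, 0..23)."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("cell outside the 4 x 6 grid")
    return row * COLS + col


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual positives that the matcher detects: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of actual negatives that the matcher rejects: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for one evaluation run (not taken from the paper).
tp, fn, tn, fp = 40, 10, 90, 10
print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"specificity = {specificity(tn, fp):.3f}")
print(f"bottom-right cell index = {cell_index(ROWS - 1, COLS - 1)}")
```

Sweeping the template-matching threshold and plotting sensitivity against (1 - specificity) at each setting would trace out a Receiver Operating Characteristic curve.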

A Modified Spiral Search Algorithm and Its Embedded System Architecture Design

One of the fastest growing areas in the embedded community is multimedia devices. Multimedia devices incorporate a number of complicated functions for their operation, such as motion estimation. A multitude of different implementations, such as spiral search, have been proposed to reduce motion estimation complexity. We have studied the implementations of spiral search and identified areas for improvement. We propose a modified spiral search algorithm with lower computational complexity than the original spiral search. We have implemented our algorithm on an embedded ARM-based architecture with a custom memory hierarchy. The resulting system yields an energy consumption reduction of up to 64% and a performance increase of up to 77%, with a small penalty of 2.3 dB, on average, in video quality compared with the original spiral search algorithm.

Post-Compression Consideration in Video Watermarking for Wireless Communication

A simple but effective digital watermarking scheme utilizing a context adaptive variable length coding (CAVLC) method is presented for wireless communication systems. In the proposed approach, the watermark bits are embedded in the final non-zero quantized coefficient of each DCT block, thereby yielding a potential reduction in the length of the coded block. As a result, the watermarking scheme not only provides the means to check the authenticity and integrity of the video stream, but also improves the compression ratio and therefore reduces both the transmission time and the storage space requirements of the coded video sequence. The results confirm that the proposed scheme enables the detection of malicious tampering attacks and reduces the size of the coded H.264 file. Therefore, the proposed scheme is suitable for video applications in wireless communication, such as 3G systems.

Novel Schemes of Pilot-Aided Integer Frequency Offset Estimation for OFDM-Based DVB-T Systems

This paper proposes two novel schemes for pilot-aided integer frequency offset (IFO) estimation in orthogonal frequency division multiplexing (OFDM)-based digital video broadcasting-terrestrial (DVB-T) systems. The conventional scheme for estimating the IFO uses only partial information from the combinations that pilots can provide, which stems from a rigorous assumption that the channel responses of the pilots used for estimating the IFO change very rapidly. Thus, in this paper, we propose novel IFO estimation schemes that exploit all the information from the combinations that pilots can provide in order to improve the performance of IFO estimation. The simulation results show that the proposed schemes are highly accurate in terms of the IFO detection probability.

Estimation of Attenuation and Phase Delay in Driving Voltage Waveform of a Digital-Noiseless, Ultra-High-Speed Image Sensor

Since 2004, we have been developing an in-situ storage image sensor (ISIS) that captures more than 100 consecutive images at a frame rate of 10 Mfps with ultra-high sensitivity, as well as the video camera for use with this ISIS. Currently, basic research is continuing in an attempt to increase the frame rate to 100 Mfps and above. In order to suppress electromagnetic noise at such high frequencies, a digital-noiseless imaging transfer scheme has been developed that uses solely sinusoidal driving voltages. This paper presents efficient yet accurate expressions to estimate the attenuation and phase delay of driving voltages through the RC networks of an ultra-high-speed image sensor. The Elmore metric for a fundamental RC chain is employed as the first-order approximation. By applying dimensional analysis to SPICE data, we found a simple expression that significantly improves the accuracy of the approximation. Similarly, another simple closed-form model to estimate phase delay through fundamental RC networks is obtained. The estimation error of both expressions is much lower than in previous works, less than 2% in most cases. The framework of this analysis can be extended to address similar issues in other VLSI structures.
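
The first-order step mentioned above can be sketched in Python as follows. The function `elmore_delay` computes the standard Elmore delay at the far end of an RC chain, and `single_pole_response` then treats that delay as the time constant of a textbook single-pole low-pass filter to estimate attenuation and phase for a sinusoidal drive. Only the Elmore metric is named in the abstract; the single-pole reading of it, the function names, the component values, and the 100 MHz drive frequency are assumptions, and the paper's refined expressions are not reproduced here.

```python
import math
from typing import List, Tuple


def elmore_delay(resistances: List[float], capacitances: List[float]) -> float:
    """Elmore delay at the far end of an RC chain.

    Segment k consists of a series resistance R_k followed by a shunt
    capacitance C_k; each R_k is weighted by all capacitance downstream of it.
    """
    total = 0.0
    downstream_c = sum(capacitances)
    for r, c in zip(resistances, capacitances):
        total += r * downstream_c
        downstream_c -= c
    return total


def single_pole_response(tau: float, freq_hz: float) -> Tuple[float, float]:
    """Gain and phase (radians) of a single-pole low-pass with time constant tau."""
    omega_tau = 2.0 * math.pi * freq_hz * tau
    gain = 1.0 / math.sqrt(1.0 + omega_tau ** 2)
    phase = -math.atan(omega_tau)
    return gain, phase


# Hypothetical uniform chain: 4 segments of 100 ohm and 1 pF each.
resistances = [100.0] * 4
capacitances = [1e-12] * 4

tau = elmore_delay(resistances, capacitances)
gain, phase = single_pole_response(tau, freq_hz=100e6)  # illustrative frequency
print(f"tau = {tau:.3e} s, gain = {gain:.3f}, phase = {math.degrees(phase):.1f} deg")
```

For a uniform chain of N segments the Elmore delay reduces to R·C·N(N+1)/2, which gives a quick sanity check on the loop.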

A Data Hiding Model with High Security Features Combining Finite State Machines and PMM method

Recent years have witnessed the rapid development of the Internet and telecommunication techniques, and information security is becoming more and more important. Applications such as covert communication and copyright protection stimulate research into information hiding techniques. Traditionally, encryption is used to secure communication. However, important information is not protected once decoded. Steganography is the art and science of communicating in a way that hides the existence of the communication. Important information is first hidden in host data, such as a digital image, video or audio, and then transmitted secretly to the receiver. In this paper, a data hiding model with high security features is proposed that combines cryptography using a finite-state sequential machine with an image-based steganography technique for communicating information more securely between two locations. The authors incorporate a secret key for authentication at both ends in order to achieve a high level of security. Before the embedding operation, the secret information is encrypted with the help of the finite-state sequential machine and segmented into different parts. The cover image is also segmented into different objects through normalized cut. Each part of the encoded secret information is embedded with the help of a novel image steganographic method (PMM) on different cuts of the cover image to form different stego objects. Finally, the stego image is formed by combining the different stego objects and is transmitted to the receiver side. At the receiving end, the opposite processes are run to recover the original secret message.

Adaptive Group of Pictures Structure Based On the Positions of Video Cuts

In this paper we propose a method which improves the efficiency of video coding. Our method combines an adaptive GOP (group of pictures) structure with shot cut detection. We have analyzed different approaches for shot cut detection with the aim of choosing the most appropriate one. The next step is to situate N frames to the positions of detected cuts during the process of video encoding. Finally, the efficiency of the proposed method is confirmed by simulations, and the obtained results are compared with fixed GOP structures of sizes 4, 8, 12, 16, 32, 64 and 128 and with a GOP structure whose length equals that of the entire video. The proposed method achieved a gain in bit rate from 0.37% to 50.59%, while providing a PSNR (Peak Signal-to-Noise Ratio) gain from 1.33% to 0.26% in comparison to the simulated fixed GOP structures.
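
The difference between a fixed and a cut-aligned GOP layout can be illustrated with a small Python sketch. It assumes that a new GOP begins at frame 0 and at every detected cut, which is one plausible reading of the abstract rather than a confirmed description of the authors' encoder. The function names `adaptive_gops` and `fixed_gops`, the half-open `(start, end)` representation, and the example cut positions are hypothetical.

```python
from typing import List, Tuple

Gop = Tuple[int, int]  # half-open frame range [start, end)


def adaptive_gops(num_frames: int, cut_positions: List[int]) -> List[Gop]:
    """Start a new GOP at frame 0 and at every detected cut inside the video."""
    starts = sorted({0, *(c for c in cut_positions if 0 < c < num_frames)})
    ends = starts[1:] + [num_frames]
    return list(zip(starts, ends))


def fixed_gops(num_frames: int, size: int) -> List[Gop]:
    """Conventional layout: a new GOP every `size` frames, regardless of content."""
    return [(s, min(s + size, num_frames)) for s in range(0, num_frames, size)]


# Hypothetical 20-frame clip with shot cuts detected at frames 7 and 15.
num_frames = 20
cuts = [7, 15]

print("adaptive:", adaptive_gops(num_frames, cuts))
print("fixed-8: ", fixed_gops(num_frames, 8))
```

With the fixed layout, the cuts at frames 7 and 15 fall inside GOPs, so frames after each cut would be predicted from unrelated content; the adaptive layout avoids that by aligning GOP boundaries with the cuts.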