Abstract: The development of wireless communication technologies has changed the way we live on a global scale. Following the international success of mobile telephony standards, voice connections that are independent of location and time have become the default in daily telecommunications. Today, advanced multimedia messaging plays a key role in value-added services. As data services evolve, the need for more complex applications is emerging, including mobile use of broadcast technologies. Here, the performance of a system design for terrestrial multimedia content is examined with emphasis on mobile reception. This review paper covers the role of the physical layer and the effects of the terrestrial channel on terrestrial multimedia transmission using OFDM, with DVB-H as the benchmark standard.
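As a rough illustration of why OFDM copes with terrestrial multipath, the sketch below sends one OFDM symbol through a two-tap channel and recovers it with a one-tap equalizer per subcarrier. It is a toy model, not the DVB-H physical layer: the 8 subcarriers, the QPSK values, the channel taps and the names `dft` and `ofdm_link` are all assumptions made for the example, and noise, pilots and coding are left out.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (the inverse is scaled by 1/N)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def ofdm_link(symbols, channel, cp_len):
    """Send one OFDM symbol through a multipath channel and equalize it.

    Assumes cp_len >= len(channel) - 1, so the cyclic prefix absorbs
    all of the echo from the channel.
    """
    n = len(symbols)
    time = dft(symbols, inverse=True)            # subcarriers -> time samples
    tx = time[n - cp_len:] + time                # prepend the cyclic prefix
    # Multipath channel modelled as a linear convolution with the tap list.
    rx = [sum(channel[m] * tx[i - m]
              for m in range(len(channel)) if i - m >= 0)
          for i in range(len(tx))]
    rx = rx[cp_len:]                             # drop the cyclic prefix
    h = dft(list(channel) + [0] * (n - len(channel)))  # channel per subcarrier
    return [y / hk for y, hk in zip(dft(rx), h)]       # one-tap equalizer

# Hypothetical QPSK symbols on 8 subcarriers and a two-tap echo channel.
qpsk = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]
recovered = ofdm_link(qpsk, channel=[1.0, 0.5], cp_len=2)
```

Because the cyclic prefix turns the channel's linear convolution into a circular one, each subcarrier sees a single complex gain, which is why a per-subcarrier division is enough to undo the echo in this noise-free setting.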
Abstract: The image segmentation method described in this
paper has been developed as a pre-processing stage to be used in
methodologies and tools for video/image indexing and retrieval by
content. This method addresses the extraction of whole objects
from the background and produces images of single complete objects
from videos or photos. The extracted images are used for calculating
the object visual features necessary for both indexing and retrieval
processes.
The segmentation algorithm is based on the cooperation among an
optical flow evaluation method, edge detection and region growing
procedures. The optical flow estimator belongs to the class of
differential methods. It detects motions ranging from a
fraction of a pixel to a few pixels per frame, achieves good results in
the presence of noise without a filtering pre-processing stage,
and includes a specialised model for moving object detection.
The first task of the presented method exploits cues from
motion analysis to detect moving areas. Objects and background
are then refined using edge detection and seeded region
growing procedures, respectively. All the tasks are performed iteratively until
objects and background are completely resolved.
The method has been applied to a variety of indoor and outdoor
scenes where objects of different types and shapes appear against
variously textured backgrounds.
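The seeded region growing step mentioned above can be sketched as follows. This is a generic textbook variant, not the authors' procedure: the 4-connectivity, the fixed intensity tolerance `tol` and the function name `region_grow` are assumptions chosen for the example.

```python
from collections import deque

def region_grow(image, seed, tol):
    """Return the set of (row, col) pixels 4-connected to `seed` whose
    intensity differs from the seed intensity by at most `tol`."""
    rows, cols = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_val) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Hypothetical 3x4 grey-level image: dark background on the left,
# brighter object on the right, one outlier pixel (90).
image = [
    [10, 11, 50, 52],
    [12, 10, 51, 50],
    [11, 12, 49, 90],
]
background = region_grow(image, seed=(0, 0), tol=5)
foreground = region_grow(image, seed=(0, 2), tol=5)
```

In a pipeline like the one described, the seeds would presumably come from the motion analysis stage rather than being chosen by hand as they are here.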
Abstract: Super-resolution nowadays refers to producing a high-resolution
image from several low-resolution noisy frames. In
this work, we consider the problem of high-quality interpolation of a
single noise-free image. Such images may come from different sources,
e.g., they may be frames of videos, individual pictures, etc. In
the encoder we downsample each frame via
bidimensional interpolation, and in the decoder we
apply an upsampling that restores the original size of the
image. If the compression ratio is very high, we use a
convolutive mask that restores the edges, eliminating the blur.
Finally, both the encoder and the complete decoder are implemented
on General-Purpose computation on Graphics Processing Units
(GPGPU) cards. In fact, the mentioned mask is coded inside texture
memory of a GPGPU.
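The idea of an edge-restoring convolutive mask can be illustrated with a common 3x3 sharpening kernel. The abstract does not give the coefficients of the actual mask, so the `SHARPEN` kernel below is an assumption, and the plain-Python loop only stands in for the GPGPU texture-memory implementation.

```python
# A widely used sharpening kernel; its coefficients sum to 1, so flat
# areas are left unchanged while intensity transitions are steepened.
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]

def apply_mask(image, mask):
    """Apply a 3x3 mask to the interior pixels of a grey-level image.

    Border pixels are copied unchanged and results are clamped to 0..255.
    The mask is symmetric, so correlation and convolution coincide.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            acc = 0
            for i in range(3):
                for j in range(3):
                    acc += mask[i][j] * image[r + i - 1][c + j - 1]
            out[r][c] = min(255, max(0, acc))
    return out

# Hypothetical blurred vertical edge: a ramp from 100 to 140.
blurred = [[100, 100, 120, 140, 140] for _ in range(3)]
sharpened = apply_mask(blurred, SHARPEN)
```

On this ramp the middle row becomes 100, 80, 120, 160, 140: the transition between the two plateaus gets steeper, which is the visual effect of removing blur.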
Abstract: The refueling of a transparent rectangular fuel tank
fitted with a standard filler pipe and roll-over valve was
experimentally studied. A fuel-conditioning cart, capable of
handling fuels of different Reid vapor pressure at a constant
temperature, was used to dispense fuel at the desired rate. The
experimental protocol included transient recording of the tank and
filler tube pressures while video recording the flow patterns in the
filler tube and tank during the refueling process. This information
was used to determine the effect of changes in the vent tube
diameter, fuel-dispense flow rate and fuel Reid vapor pressure on the
pressure-time characteristics and the occurrence of premature fuel
filling shut-off and fuel spill-back. Pressure-time curves for the case
of normal shut-off demonstrated the classic, three-phase
characteristic noted in the literature. The variation of the maximum
values of tank dome and filler tube pressures is analyzed in relation
to the occurrence of premature shut-off.
Abstract: During the past several years, face recognition in video
has received significant attention. Both the wide range of
commercial and law enforcement applications and the availability
of feasible technologies after several decades of research contribute
to the trend. Although current face recognition systems have reached a
certain level of maturity, their development is still limited by the
conditions encountered in many real applications. For example,
recognizing faces in video sequences acquired in an open
environment with changes in illumination, pose, facial
occlusion, or low image resolution remains a largely
unsolved problem. In other words, current algorithms still need
further development. This paper provides an up-to-date survey of video-based
face recognition research. To present a comprehensive survey, we
categorize existing video-based recognition approaches and present
detailed descriptions of representative methods within each category.
In addition, relevant topics such as real-time detection and real-time
tracking in video, and issues such as illumination, pose, 3D and low
resolution, are covered.
Abstract: Because of the great advance in multimedia
technology, digital multimedia is vulnerable to malicious
manipulations. In this paper, a public key self-recovery block-based
video authentication technique is proposed which can not only
precisely localize alterations but also recover the missing
data with high reliability. In the proposed block-based technique,
multiple description coding (MDC) is used to generate two codes (two
descriptions) for each block. Although one block code (one
description) is enough to rebuild the altered block, the altered block
is rebuilt with better quality from the two block descriptions. Using
MDC thus increases the reliability of recovering data. A block signature is
computed using a cryptographic hash function and a doubly linked
chain is utilized to embed the block signature copies and the block
descriptions into the LSBs of distant blocks and the block itself. The
doubly linked chain scheme gives the proposed technique the
capability to thwart vector quantization attacks. In our proposed
technique, anyone can check the authenticity of a given video using
the public key. The experimental results show that the proposed
technique is reliable for detecting, localizing and recovering the
alterations.
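The combination of a hash-based block signature with LSB embedding can be sketched as below. This is a simplified illustration under several assumptions: SHA-256 stands in for the unspecified cryptographic hash, the public-key signing step, the MDC descriptions and the doubly linked chain are omitted, and the function names and the 256-pixel blocks are hypothetical.

```python
import hashlib

def block_signature_bits(block):
    """Hash a block with its LSBs cleared and return the 256 digest bits.

    Clearing the LSBs first means that data later embedded in the LSB
    plane does not change the signature of the block.
    """
    cleared = bytes(p & 0xFE for p in block)
    digest = hashlib.sha256(cleared).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def embed_lsb(block, bits):
    """Write one bit into the least significant bit of each pixel."""
    out = list(block)
    for i, b in enumerate(bits):
        out[i] = (out[i] & 0xFE) | b
    return out

def extract_lsb(block, n_bits):
    """Read back the first n_bits least significant bits."""
    return [p & 1 for p in block[:n_bits]]

# Two hypothetical 16x16 blocks (256 pixels each), flattened row by row.
block_a = [(i * 7) % 256 for i in range(256)]
block_b = [128] * 256

sig = block_signature_bits(block_a)
carrier = embed_lsb(block_b, sig)        # a distant block carries A's signature

tampered = list(block_a)
tampered[3] ^= 0x10                      # alter one pixel above the LSB plane
is_authentic = extract_lsb(carrier, 256) == block_signature_bits(block_a)
is_tampered_ok = extract_lsb(carrier, 256) == block_signature_bits(tampered)
```

Here `is_authentic` is true and `is_tampered_ok` is false, which is how a mismatch would localize the altered block. In the proposed technique the signature would additionally be signed so that anyone holding the public key can run the check.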
Abstract: This paper presents an implementation of an object tracking system for video sequences. Object tracking is an important task in many vision applications. Video analysis has two main steps: detection of interesting moving objects and tracking of such objects from frame to frame. Most tracking algorithms use pre-specified methods for preprocessing. In our work, we have implemented several object tracking algorithms (Meanshift, Camshift, Kalman filter) with different preprocessing methods. We then evaluated the performance of these algorithms on different video sequences. The results show good performance with respect to the degree of applicability and the evaluation criteria.
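As an illustration of one of the trackers named above, here is a minimal mean-shift sketch. It assumes that preprocessing has already produced a per-pixel weight map (for example a histogram back-projection); the square window, the rounding to integer pixel positions and the name `mean_shift` are choices made for this example, not details taken from the paper.

```python
def mean_shift(weights, center, half, max_iter=20):
    """Move a square window to the local centroid of a 2-D weight map.

    weights: 2-D list of non-negative target likelihoods per pixel.
    center:  (row, col) starting position of the window.
    half:    half-size of the window, which spans 2*half + 1 pixels.
    """
    rows, cols = len(weights), len(weights[0])
    r, c = center
    for _ in range(max_iter):
        total = sum_r = sum_c = 0.0
        for i in range(max(0, r - half), min(rows, r + half + 1)):
            for j in range(max(0, c - half), min(cols, c + half + 1)):
                w = weights[i][j]
                total += w
                sum_r += w * i
                sum_c += w * j
        if total == 0:
            break                       # nothing to track inside the window
        new_r, new_c = round(sum_r / total), round(sum_c / total)
        if (new_r, new_c) == (r, c):
            break                       # converged
        r, c = new_r, new_c
    return r, c

# Hypothetical 7x7 weight map with a small blob centred at (4, 5).
weights = [[0] * 7 for _ in range(7)]
weights[4][5] = 4
for rr, cc in ((4, 4), (4, 6), (3, 5), (5, 5)):
    weights[rr][cc] = 1
position = mean_shift(weights, center=(2, 3), half=2)
```

Starting from (2, 3), the window drifts to the blob centre at (4, 5) and stops there. Camshift builds on the same iteration by also adapting the window size, and a Kalman filter would typically be used to predict the starting position in the next frame.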
Abstract: In recent years, the number of information
leaks has been increasing. Companies and research institutions take various
actions against information theft and security incidents. One of these
actions is the adoption of crime prevention systems, including
monitoring by surveillance cameras. To overcome the
difficulties of monitoring multiple cameras, we are developing an automatic
human tracking system that uses mobile agents across multiple
surveillance cameras to track target persons. In this paper, we develop
a monitor that confirms that mobile agents are tracing target persons, and
a simulator of video picture analysis for constructing the tracking
algorithm.
Abstract: This paper presents an adaptive motion estimator
that can be dynamically reconfigured with the best algorithm
depending on how the nature of the video varies during the lifetime
of a running application. The 4 Step Search (4SS) and the
Gradient Search (GS) algorithms are integrated in the estimator
for use with rapid and slow video sequences,
respectively. The Full Search Block Matching (FSBM) algorithm
has also been integrated for use with
video sequences that are not real-time oriented.
In order to reduce the computational cost efficiently while
achieving better visual quality at low power cost, the proposed
motion estimator is based on a Variable Block Size (VBS) scheme
that uses only the 16x16, 16x8, 8x16 and 8x8 modes.
Experimental results show that the adaptive motion estimator
achieves better results in terms of Peak Signal to Noise Ratio
(PSNR), computational cost, FPGA occupied area, and dissipated
power than the most popular variable block size schemes
presented in the literature.
Abstract: In high bitrate information hiding techniques, 1 bit is
embedded within each 4 x 4 Discrete Cosine Transform (DCT)
coefficient block by means of vector quantization, and the hidden bit
can then be effectively extracted at the terminal end. In this paper, high bitrate
information hiding algorithms are summarized, and a video-in-video
scheme is implemented. Experimental results show that the host
video, in which a large amount of auxiliary information is embedded, suffers little
decline in visual quality. The Y-component Peak Signal to Noise Ratio (PSNR-Y) of the host
video degrades by only 0.22 dB on average, while the hidden information
has a high survival rate and remains highly robust under
H.264/AVC compression; the average Bit Error Rate (BER) of the hidden
information is 0.015%.
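For reference, the two quality measures quoted above can be computed as in the following sketch. It uses the standard definitions with 8-bit samples (peak value 255); the function names and the sample data are illustrative and do not reproduce the reported experiment.

```python
import math

def psnr(original, distorted, peak=255):
    """Peak Signal to Noise Ratio in dB between two equal-length sample lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf                 # identical signals
    return 10 * math.log10(peak ** 2 / mse)

def bit_error_rate(sent, received):
    """Fraction of bits that differ between the embedded and extracted data."""
    errors = sum(s != r for s, r in zip(sent, received))
    return errors / len(sent)

# Hypothetical luma samples before and after embedding, and hidden bits
# before embedding and after extraction.
quality_db = psnr([10, 20, 30, 40], [11, 19, 30, 40])
ber = bit_error_rate([1, 0, 1, 1], [1, 1, 1, 1])
```

A BER of 0.015% corresponds to a return value of 0.00015 from `bit_error_rate`, that is, 15 wrong bits in every 100,000 extracted bits.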
Abstract: People detection from images has a variety of applications, such as video surveillance and driver assistance systems, but it remains a challenging task, and it is more difficult in crowded environments such as shopping malls, where the lower parts of the human body are often occluded. The lack of full-body information calls for more effective features than common ones such as HOG. In this paper, new features are introduced that exploit the global self-symmetry (GSS) characteristic of head-shoulder patterns. The features encode the similarity or difference of color histograms and oriented gradient histograms between two vertically symmetric blocks. These domain-specific features can be computed rapidly from integral images in the Viola-Jones cascade-of-rejecters framework. The proposed features are evaluated on our own head-shoulder dataset, which in part consists of the well-known INRIA pedestrian dataset. Experimental results show that the GSS features reduce false alarms marginally and that the gradient GSS features are preferred more often than the color GSS ones in feature selection.
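The reason such block features are fast to compute is the integral image, sketched below. The final comparison of two mirrored blocks is only a toy symmetry cue on raw intensities, assumed here for illustration; the actual GSS features compare color and gradient histograms, which would use one integral image per histogram bin.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column at the top-left."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of any rectangle from four table look-ups, independent of its size."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

# Hypothetical 2x4 image split into a left block and its mirrored right block.
img = [[1, 2, 3, 4],
       [5, 6, 7, 8]]
ii = integral_image(img)
left_block = rect_sum(ii, top=0, left=0, height=2, width=2)
right_block = rect_sum(ii, top=0, left=2, height=2, width=2)
asymmetry = abs(left_block - right_block)
```

Each block sum costs four look-ups regardless of block size, which is what makes it practical to evaluate many candidate block pairs inside a detection cascade.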
Abstract: This paper proposes and implements a core transform architecture; the core transform is one of the major processes in the HEVC video compression standard. The proposed architecture is implemented with only adders and shifters instead of area-consuming multipliers. The shifters are implemented with wires and multiplexers, which significantly reduces chip area. In addition, the architecture can process blocks from 4×4 to 16×16 with common hardware by reusing processing elements. The core transform architecture, designed in 0.13 um technology, can process a 16×16 block with a 2-D transform in 130 cycles, and its gate count is 101,015 gates.
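To show what replacing multipliers with adders and shifters means, the sketch below computes the 4-point HEVC forward core transform using only shifts, additions and subtractions. The coefficients 64, 83 and 36 are those of the HEVC 4x4 transform matrix, but the particular shift-add decompositions and the butterfly arrangement are one possible choice assumed for illustration, not necessarily the paper's datapath. The scaling and rounding stages of the standard are omitted.

```python
def mul64(x):
    return x << 6                                  # 64 = 2^6

def mul83(x):
    return (x << 6) + (x << 4) + (x << 1) + x      # 83 = 64 + 16 + 2 + 1

def mul36(x):
    return (x << 5) + (x << 2)                     # 36 = 32 + 4

def core_transform_4(x):
    """1-D 4-point HEVC core transform (unscaled), multiplier-free."""
    e0, e1 = x[0] + x[3], x[1] + x[2]              # even part of the butterfly
    o0, o1 = x[0] - x[3], x[1] - x[2]              # odd part of the butterfly
    return [mul64(e0) + mul64(e1),
            mul83(o0) + mul36(o1),
            mul64(e0) - mul64(e1),
            mul36(o0) - mul83(o1)]

# Reference result by direct matrix multiplication, for comparison.
HEVC_4 = [[64, 64, 64, 64],
          [83, 36, -36, -83],
          [64, -64, -64, 64],
          [36, -83, 83, -36]]

def matrix_transform_4(x):
    return [sum(row[i] * x[i] for i in range(4)) for row in HEVC_4]

sample = [1, 2, 3, 4]
fast = core_transform_4(sample)
reference = matrix_transform_4(sample)
```

Both paths give the same coefficients. In hardware a fixed shift is just a rewiring of the bus, which is consistent with the abstract's remark that the shifters reduce to wires and multiplexers.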
Abstract: Efficient storage, transmission and use of video information are key requirements in many multimedia applications currently being addressed by MPEG-4. To fulfill these requirements, a new approach for representing video information, which relies on an object-based representation, has been adopted. Therefore, object-based watermarking schemes are needed for copyright protection. This paper proposes a novel blind object watermarking scheme for images and video using the in-place lifting shape-adaptive discrete wavelet transform (SA-DWT). In order to make the watermark robust and transparent, the watermark is embedded in the average of wavelet blocks using a visual model based on the human visual system. The n least significant bits (LSBs) of the wavelet coefficients are adjusted in concert with the average. Simulation results show that the proposed watermarking scheme is perceptually invisible and robust against many attacks such as lossy image/video compression (e.g. JPEG, JPEG2000 and MPEG-4), scaling, adding noise, filtering, etc.
Abstract: This paper presents an efficient VLSI architecture
design to achieve real-time video processing using the Full-Search Block
Matching (FSBM) algorithm. The design employs a parallel bank
architecture with minimum latency, maximum throughput, and full
hardware utilization. We use nine parallel processors in our
architecture, each controlled by a state machine. State machine
control makes the design very simple and cost
effective. The design is implemented in VHDL, and the
programming techniques we incorporated make the design
completely programmable in the sense that the search ranges and the
block sizes can be varied to suit any given requirements. The design
can operate at frequencies up to 36 MHz and can process QCIF
and CIF video at 1.46 MHz and 5.86 MHz, respectively.
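A software reference for full-search block matching, with the block size and search range as parameters, might look like the sketch below. The sum of absolute differences (SAD) is assumed as the matching criterion since the abstract does not name one, and the sequential loops say nothing about how the nine parallel processors divide the work.

```python
def sad(cur, ref, br, bc, dr, dc, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (br, bc) and the block of `ref` displaced by (dr, dc)."""
    return sum(abs(cur[br + i][bc + j] - ref[br + dr + i][bc + dc + j])
               for i in range(n) for j in range(n))

def full_search(cur, ref, br, bc, n, search):
    """Test every displacement within +/- search pixels and return
    (best motion vector, its SAD)."""
    rows, cols = len(ref), len(ref[0])
    best_cost, best_mv = None, None
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            r0, c0 = br + dr, bc + dc
            if r0 < 0 or c0 < 0 or r0 + n > rows or c0 + n > cols:
                continue                # candidate falls outside the frame
            cost = sad(cur, ref, br, bc, dr, dc, n)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dr, dc)
    return best_mv, best_cost

# Hypothetical 8x8 reference frame with unique values, and a current frame
# whose 2x2 block at (2, 2) is a copy of the reference block at (3, 4).
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
cur = [[0] * 8 for _ in range(8)]
for i in range(2):
    for j in range(2):
        cur[2 + i][2 + j] = ref[3 + i][4 + j]
motion_vector, cost = full_search(cur, ref, br=2, bc=2, n=2, search=2)
```

The search recovers the displacement (1, 2) with zero SAD. Every candidate is evaluated independently of the others, which is what makes the algorithm regular enough to spread across parallel processing elements.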
Abstract: Does communication modality matter in delivering e-learning information? With the recent growth of broadcasting systems, media technologies and e-learning content, various systems with different communication modalities have been introduced. In line with these trends, this study examines the effects of the information delivery modality on the psychology of students. Findings from an experiment indicated that delivering information with a video modality elicited higher degrees of credibility, quality, representativeness of content, and perceived suitability for delivering information than auditory information did. However, no difference was found in content liking or attitude. The implications of the findings and the limitations are discussed.
Abstract: Digital Video Broadcasting - Terrestrial (DVB-T)
allows combining broadcasting, telephone and data services in one
network. It has facilitated mobile TV broadcasting. Mobile TV
broadcasting is marked by a fragmentation of the standards in use on
different continents. In Asia, T-DMB and ISDB-T are used, while
Europe mainly uses DVB-H and the USA uses MediaFLO. Issues of
royalties for the developers of these incompatible technologies,
investments already made and differing local conditions will make it
difficult to agree on a unified standard in the very near future. Despite
this shortcoming, mobile TV has shown very good market potential.
There are a number of challenges that still exist for regulators,
investors and technology developers but the future looks bright.
There is a need for mobile telephone operators to cooperate with
content providers and those operating terrestrial digital broadcasting
infrastructure for mutual benefit.
Abstract: Discrete Cosine Transform (DCT) based transform coding is very popular in image, video and speech compression due to its good energy compaction and decorrelating properties. However, at low bit rates, the reconstructed images generally suffer from visually annoying blocking artifacts as a result of coarse quantization. The lapped transform was proposed as an alternative to the DCT with reduced blocking artifacts and increased coding gain. Lapped transforms are popular for their good performance, robustness against oversmoothing and the availability of fast implementation algorithms. However, no proper study has been reported in the literature regarding the statistical distributions of block Lapped Orthogonal Transform (LOT) and Lapped Biorthogonal Transform (LBT) coefficients. This study performs two goodness-of-fit tests, the Kolmogorov-Smirnov (KS) test and the χ² test, to determine the distribution that best fits the LOT and LBT coefficients. The experimental results show that the distribution of a majority of the significant AC coefficients can be modeled by the Generalized Gaussian distribution. Knowledge of the statistical distribution of transform coefficients greatly helps in the design of optimal quantizers that may lead to minimum distortion and hence achieve optimal coding efficiency.
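The KS goodness-of-fit statistic used in such a study can be computed as in the sketch below. The candidate distribution here is the Laplace distribution, which is the Generalized Gaussian with shape parameter 1; the fixed location and scale, the sample values and the function names are assumptions for illustration, whereas a real study would estimate the parameters from the coefficients.

```python
import math

def laplace_cdf(x, mu=0.0, b=1.0):
    """CDF of the Laplace distribution with location mu and scale b."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)

def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF of `samples` and a model CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x,
        # so both sides of the step are checked.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Made-up AC coefficient values, for demonstration only.
coeffs = [-2.1, -0.4, -0.1, 0.0, 0.2, 0.5, 1.7]
d_laplace = ks_statistic(coeffs, laplace_cdf)
```

A smaller statistic indicates a closer fit, so candidate distributions can be ranked by computing this value for each of them on the same set of coefficients.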
Abstract: Visual information is very important in human perception
of the surrounding world. Video is one of the most common ways to
capture visual information. Video capability has many benefits
and can be used in various applications. For the most part,
video information is used for entertainment and relaxation;
moreover, it can improve the quality of life of deaf people. Visual
information is crucial for hearing-impaired people, as it allows them to
communicate in person using sign language. Some parts of the
person being spoken to are more important than others (e.g. hands,
face). Therefore, information about the visually relevant parts of the
image allows us to design an objective metric for this specific case. In
this paper, we present an example of an objective metric based on
human visual attention and detection of salient objects in the observed
scene.
Abstract: Motion detection is a basic operation in the selection of significant segments of video signals. For effective Human Computer Intelligent Interaction, the computer needs to recognize motion and track the moving object. Here, an efficient neural network system is proposed for detecting motion against a static background. The method consists of four main parts: Frame Separation, Rough Motion Detection, Network Formation and Training, and Object Tracking. The method can be used to verify real-time detections, which makes it applicable to defense, bio-medical and robotics applications. It can also be used to obtain detection information on the size, location and direction of motion of moving objects for assessment purposes. The time taken for video tracking by this neural network is only a few seconds.
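A rough motion detection stage against a static background is often built on frame differencing, sketched below. The abstract does not say how its rough detection works, so the thresholded difference, the bounding box and the function names are assumptions; the neural network stages are not covered.

```python
def motion_mask(prev, cur, thresh):
    """Mark pixels whose intensity changed by more than `thresh`."""
    return [[abs(c - p) > thresh for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, cur)]

def bounding_box(mask):
    """Return (top, left, bottom, right) of the changed pixels, or None."""
    pts = [(r, c) for r, row in enumerate(mask)
           for c, changed in enumerate(row) if changed]
    if not pts:
        return None
    rs = [r for r, _ in pts]
    cs = [c for _, c in pts]
    return min(rs), min(cs), max(rs), max(cs)

# Two hypothetical 4x4 frames: a bright object appears in column 2.
prev_frame = [[0] * 4 for _ in range(4)]
cur_frame = [[0] * 4 for _ in range(4)]
cur_frame[1][2] = 200
cur_frame[2][2] = 200
box = bounding_box(motion_mask(prev_frame, cur_frame, thresh=30))
```

The box gives the location and size of the moving region. Comparing box centres across consecutive frame pairs would give a coarse direction of motion.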
Abstract: This paper presents a non-invasive 3D eye tracker
for clinical optometry applications. Measurements of biomechanical
variables in clinical practice have many sources of error associated with
traditional procedures such as the cover test (CT), near point of
accommodation (NPC), eye ductions (ED), eye vergences (EG) and
eye versions (ES). Ocular motility should always be tested, but all
evaluations are interpreted subjectively by practitioners; the
results are based on clinical experience, and repeatability and accuracy
do not exist. Optometric-lab is a tool with 3 (three) analog video
cameras, triggered and synchronized by one A/D acquisition board.
The variables globe rotation angle and velocity can be quantified.
Data were recorded at 27 Hz; camera calibration
was performed in a known volume, with adjustments for radial image
distortion.