Abstract: This paper presents an evaluation of a wavelet-based
digital watermarking technique for estimating the quality of
video sequences transmitted over an Additive White Gaussian Noise
(AWGN) channel in terms of a classical objective metric, the
Peak Signal-to-Noise Ratio (PSNR), without the need for the original
video. In this method, a watermark is embedded into the Discrete
Wavelet Transform (DWT) domain of the original video frames
using a quantization method. The degradation of the extracted
watermark can be used to estimate the video quality in terms of
PSNR with good accuracy. We calculated PSNR for video frames
contaminated with AWGN and compared the values with those
estimated using the Watermarking-DWT based approach. It is found
that the calculated and estimated quality measures of the video
frames are highly correlated, suggesting that this method can provide
a good quality measure for video frames transmitted over an AWGN
channel without the need for the original video.
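As a point of reference, the calculated (full-reference) PSNR that the estimates are compared against can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes 8-bit frames flattened into lists of pixel values, and the function name `psnr` and the `peak` parameter are chosen here for illustration. The watermark-based estimation itself is not reproduced.

```python
import math


def psnr(reference, degraded, peak=255.0):
    """Return the PSNR in dB between two frames given as flat pixel lists.

    Assumption: both frames have the same size and use the same peak value
    (255 for 8-bit samples).
    """
    if len(reference) != len(degraded):
        raise ValueError("frames must have the same number of pixels")
    # Mean squared error between the two frames.
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values indicate less distortion; identical frames give an infinite PSNR.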
Abstract: In H.264/AVC video encoding, rate-distortion
optimization for mode selection plays a significant role in achieving
outstanding performance in compression efficiency and video quality.
However, this mode selection process also makes the encoding
process extremely complex, especially in the computation of the rate-distortion
cost function, which includes the computation of the sum
of squared difference (SSD) between the original and reconstructed
image blocks and context-based entropy coding of the block. In this
paper, a transform-domain rate-distortion optimization accelerator
based on fast SSD (FSSD) and VLC-based rate estimation algorithm
is proposed. This algorithm can significantly simplify the hardware
architecture for the rate-distortion cost computation with only
negligible performance degradation. An efficient hardware structure
for implementing the proposed transform-domain rate-distortion
optimization accelerator is also proposed. Simulation results
demonstrate that the proposed algorithm reduces total
encoding time by about 47% with negligible degradation of coding performance.
The proposed method can be easily applied to many mobile video
applications such as digital cameras and DMB (Digital
Multimedia Broadcasting) phones.
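The rate-distortion cost discussed above is commonly written as J = D + λR, with D the SSD between the original and reconstructed blocks and R the number of bits. The sketch below shows only this conventional pixel-domain form. The function names, the mode names and the numeric values are illustrative assumptions; the transform-domain FSSD and the VLC-based rate estimate of the paper are not reproduced.

```python
def ssd(original, reconstructed):
    """Sum of squared differences between two equally sized 2D blocks."""
    return sum(
        (o - r) ** 2
        for row_o, row_r in zip(original, reconstructed)
        for o, r in zip(row_o, row_r)
    )


def rd_cost(distortion, rate_bits, lagrange_multiplier):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lagrange_multiplier * rate_bits


def best_mode(candidates, lagrange_multiplier):
    """Pick the mode with the lowest cost.

    candidates: list of (mode_name, distortion, rate_bits) tuples.
    """
    return min(
        candidates,
        key=lambda c: rd_cost(c[1], c[2], lagrange_multiplier),
    )[0]
```

A larger multiplier penalizes rate more heavily, so the selected mode can change as λ grows.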
Abstract: This study proposes a new recommender system based on collaborative folksonomy. The purpose of the proposed system is to recommend Internet resources (such as books, articles, documents, pictures, audio and video) to users. The proposed method includes four steps: creating the user profile based on tags, grouping similar users into clusters using agglomerative hierarchical clustering, finding similar resources based on the user's past collections by using content-based filtering, and recommending similar items to the target user. This study examines the system's performance on a dataset collected from “del.icio.us,” a well-known social bookmarking website. Experimental results show that the proposed hybrid recommender system, which combines tag-based collaborative filtering and content-based filtering, is promising and effective for folksonomy-based bookmarking websites.
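The first step, building a tag-based user profile, and the user comparison that clustering relies on can be sketched as below. This is only an illustration under stated assumptions: the abstract does not name the similarity measure, so cosine similarity is assumed here, and `tag_profile` and `cosine_similarity` are hypothetical names.

```python
import math
from collections import Counter


def tag_profile(bookmarks):
    """Build a tag-frequency profile from a user's bookmarks.

    bookmarks: list of tag lists, one list per bookmarked resource.
    """
    return Counter(tag for tags in bookmarks for tag in tags)


def cosine_similarity(p, q):
    """Cosine similarity between two tag-frequency profiles (assumed measure)."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # an empty profile is similar to nothing
    return dot / (norm_p * norm_q)
```

Users whose profiles score close to 1 would be candidates for the same cluster; users with no tags in common score 0.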
Abstract: Packet-switched data networks such as the Internet, which have
traditionally supported throughput-sensitive applications such as email
and file transfer, are increasingly supporting delay-sensitive
multimedia applications such as interactive video. These delay-sensitive
applications would often rather sacrifice some throughput
for better delay. Unfortunately, current packet-switched networks
do not offer choices, but instead provide monolithic best-effort
service to all applications. This paper evaluates the Class Based Queuing
(CBQ), Coordinated Earliest Deadline First (CEDF), Weighted
Switch Deficit Round Robin (WSDRR) and RED-Boston scheduling
schemes, which are sensitive to the delay bound expectations of a variety of
real-time applications, and proposes an enhancement of WSDRR.
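For orientation, the classic Deficit Round Robin discipline that the WSDRR name builds on can be sketched as follows. This is plain DRR with a per-queue quantum acting as a weight; it is not WSDRR itself and not the enhancement proposed in the paper, and the function and parameter names are illustrative.

```python
from collections import deque


def deficit_round_robin(queues, quanta, rounds):
    """Serve packets with classic Deficit Round Robin.

    queues: list of deques holding packet sizes in bytes (modified in place).
    quanta: per-queue quantum in bytes; a larger quantum means a larger share.
    Returns the service order as a list of (queue_index, packet_size).
    """
    deficits = [0] * len(queues)
    served = []
    for _ in range(rounds):
        for i, queue in enumerate(queues):
            if not queue:
                deficits[i] = 0  # idle queues do not accumulate credit
                continue
            deficits[i] += quanta[i]
            # Send packets while the head packet fits in the deficit counter.
            while queue and queue[0] <= deficits[i]:
                size = queue.popleft()
                deficits[i] -= size
                served.append((i, size))
            if not queue:
                deficits[i] = 0
    return served
```

A queue whose head packet is larger than its current deficit waits and carries the credit over to the next round.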
Abstract: Shot boundary detection is a fundamental step in the organization of large video data. In this paper, we propose a new method for the detection and classification of gradual shot transitions in video, using the advantages of fractal analysis and an AIS-based classifier. The proposed features are the “vertical intercept” and “fractal dimension” of each video frame, which are computed using Fourier transform coefficients. We also use a classifier based on the Clonal Selection Algorithm. We have implemented our solution and assessed it on the TRECVID2006 benchmark dataset.
Abstract: Camera calibration plays an important role in the analysis of sports video. In soccer video, the cross-points that can be used for calibration near the center of the field are in most cases insufficient, so this paper introduces a new automatic camera calibration algorithm that addresses this problem by using the properties of the images of the center circle, the halfway line and a touch line. After the theoretical analysis, a practicable automatic algorithm is proposed. Although very little information is used, experimental results on both synthetic and real data show that the algorithm is applicable.
Abstract: Recognizing human action from videos is an active
field of research in computer vision and pattern recognition. Human
activity recognition has many potential applications such as video
surveillance, human-machine interaction, sports video retrieval and
robot navigation. Currently, local descriptors and bag-of-visual-words
models achieve state-of-the-art performance for human action
recognition. The main challenge in feature description is how to
represent the local motion information efficiently. Most
previous works focus on extending 2D local descriptors to 3D
ones to describe local information around every interest point. In this
paper, we propose a new spatio-temporal descriptor based on a space-time
description of moving points. Our description is based on an
Accordion representation of video, which is well suited to recognizing
human actions from 2D local descriptors without the need for 3D
extensions. We use the bag-of-words approach to represent videos.
We quantify 2D local descriptors describing both temporal and spatial
features with a good compromise between computational complexity
and action recognition rates. We have reached impressive results on
a publicly available action data set.
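The bag-of-words step mentioned above can be illustrated with a generic sketch: each local descriptor is assigned to its nearest codeword and the video is represented by the normalized histogram of assignments. The codebook is assumed to be given (for example, from clustering training descriptors), and the function names are illustrative. The Accordion representation and the descriptor itself are not reproduced.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(descriptor, codebook[k])),
    )


def bag_of_words(descriptors, codebook):
    """Normalized histogram of codeword assignments for one video."""
    histogram = [0.0] * len(codebook)
    for descriptor in descriptors:
        histogram[nearest_codeword(descriptor, codebook)] += 1.0
    total = sum(histogram)
    # An empty descriptor set yields an all-zero histogram.
    return [h / total for h in histogram] if total else histogram
```

The resulting fixed-length histogram can then be passed to any standard classifier.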
Abstract: This paper presents a narrative management system
for organizations to capture an organization's tacit knowledge
through stories. The intention of capturing tacit knowledge is to
address the problem that comes with workforce mobility in an
organization. Storytelling in the knowledge management context is
seen as a powerful management tool for communicating tacit
knowledge in an organization. This narrative management system is
developed, firstly, to enable the uploading of many types of knowledge
sharing stories, from general to work-specific stories, and,
secondly, to give each video a comment function through which knowledge
users can post comments to other knowledge users. The narrative
management system allows users to browse, search and view
the stories. In the system, stories are stored in a video repository.
Stories produced from this framework will improve
learning, the facilitation of knowledge transfer and tacit knowledge
quality in an organization.
Abstract: Detecting player identity is a challenging task in sports video content analysis. In soccer video, player number recognition is an effective and precise solution. Jersey numbers can be considered scene text, and difficulties in localization and recognition arise from variations in orientation, size, illumination, motion, etc. This paper proposes a new method for player number localization and recognition. By observing hue, saturation and value for 50 different jersey examples, we noticed that a combination of low- and high-saturation pixels is most often used to separate the number and jersey regions. An image segmentation method based on this observation is introduced. Then, a novel method for player number localization based on internal contours is proposed. False number candidates are filtered using area and aspect ratio. Before OCR processing, the extracted numbers are enhanced using image smoothing and rotation normalization.
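The candidate filtering step can be sketched as a simple test on each bounding box. The abstract gives no threshold values, so all limits below are caller-supplied assumptions, and the function name and box layout are illustrative.

```python
def filter_number_candidates(boxes, min_area, max_area, min_aspect, max_aspect):
    """Keep bounding boxes whose area and aspect ratio look like a jersey number.

    boxes: list of (x, y, width, height) tuples.
    Aspect ratio is taken as width / height; all thresholds are assumptions.
    """
    kept = []
    for x, y, width, height in boxes:
        if width <= 0 or height <= 0:
            continue  # degenerate box
        area = width * height
        aspect = width / height
        if min_area <= area <= max_area and min_aspect <= aspect <= max_aspect:
            kept.append((x, y, width, height))
    return kept
```

Boxes that are too small, too large, or too elongated are discarded before OCR.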
Abstract: In this paper we propose a method for the recognition of
adult video based on the support vector machine (SVM). Different kernel
features are proposed to classify adult videos. The SVM has the advantage
that it is insensitive to the relative number of training examples in the
positive (adult video) and negative (non-adult video) classes. This
advantage is illustrated by comparing the performance of different
SVM kernels for the identification of adult video.
Abstract: One very interesting field of research in Pattern Recognition that has gained much attention in recent times is Gesture Recognition. In this paper, we consider a form of dynamic hand gestures that are characterized by total movement of the hand (arm) in space. For these types of gestures, the shape of the hand (palm) during gesturing does not bear any significance. In our work, we propose a model-based method for tracking hand motion in space, thereby estimating the hand motion trajectory. We employ the dynamic time warping (DTW) algorithm for time alignment and normalization of the spatio-temporal variations that exist among samples belonging to the same gesture class. During training, one template trajectory and one prototype feature vector are generated for every gesture class. The features used in our work include some static and dynamic motion trajectory features. Recognition is accomplished in two stages. In the first stage, all unlikely gesture classes are eliminated by comparing the input gesture trajectory to all the template trajectories. In the next stage, the feature vector extracted from the input gesture is compared to all the class prototype feature vectors using a distance classifier. Experimental results demonstrate that our proposed trajectory estimator and classifier are suitable for Human Computer Interaction (HCI) platforms.
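The DTW alignment used to compare an input trajectory with a template can be sketched with the textbook dynamic-programming recurrence. This is a generic version, assuming 2D trajectory points and Euclidean point distance; it is not necessarily the exact variant or normalization used by the authors.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 2D trajectories.

    a, b: non-empty lists of (x, y) points. The local cost is the Euclidean
    distance between points (an assumption of this sketch).
    """
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] and b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Two trajectories that follow the same path at different speeds obtain a distance of zero, which is the time-normalization property the first recognition stage relies on.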
Abstract: In this paper, we propose a robust moving object
detection method for night street images affected by lighting, based on
updating a block-based reference background model using block-state analysis.
The experimental images are a color video sequence acquired from a stationary
camera. When artificial illumination such as a street light or sign
light suddenly appears, the reference background model is updated with this
information. Natural illumination generally changes over time, whereas
artificial illumination appears suddenly. To detect artificial illumination
accurately, the method therefore uses a two-stage process. The first stage
compares the current image with the reference
background block by block to identify the changed blocks. The second
stage compares the edge map of the current image with the edge map of the
reference background image, which makes it possible to estimate the illumination in
any block. This information makes it possible to detect objects and
artificial illumination accurately and to generate a cleaner reference
background. Blocks are classified by block-state analysis, which
distinguishes four states (transient, stationary, background and artificial
illumination). Fig. 1 shows the characteristics of each block state
[1]. Experimental results show that the presented approach works well
in the presence of illumination variation.
Abstract: Many contemporary telemedical applications rely on
regular consultations over the phone or video conferencing, which
consume valuable resources such as the doctors' time. Some
applications or treatments allow automated diagnostics on the patient
side, which notify the doctors only when a significant worsening
of the patient’s condition is measured.
Such programs can save valuable resources but an important
implementation issue is how to ensure effective and cheap diagnostics
on the patient side. First, specific diagnostic devices on the patient side
are expensive and second, they need to be user-friendly to encourage
the patient’s cooperation and reduce errors in usage, which may cause
noise in the diagnostic data.
This article proposes the use of modern smartphones and various
built-in or attachable sensors as universal diagnostic devices applicable
in a wider range of telemedical programs and demonstrates their
application in a case study: a program for schizophrenic relapse
prevention.
Abstract: The emergence of the Internet has sparked a
revolution in information storage and retrieval. As most of the
data on the web is unstructured and contains a mix of text,
video, audio, etc., there is a need to mine information to cater to
the specific needs of the users without loss of important
hidden information. Thus, developing user-friendly and
automated tools for providing relevant information quickly
has become a major challenge in web mining research. Most of
the existing web mining algorithms have concentrated on
finding frequent patterns while neglecting the less frequent
ones that are likely to contain outlying data such as noise,
irrelevant and redundant data. This paper mainly focuses on a
Signed approach and full word matching on the organized
domain dictionary for mining web content outliers. This
Signed approach gives the relevant web documents as well as
the outlying web documents. As the dictionary is organized based
on the number of characters in a word, searching and retrieval
of documents take less time and less space.
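The idea of a dictionary organized by word length with full word matching can be sketched as below. This illustrates only the indexing and lookup; the Signed approach itself is not described in the abstract and is not reproduced, and the function names and the matching-fraction score are assumptions of this sketch.

```python
from collections import defaultdict


def build_length_index(words):
    """Group dictionary words by their number of characters."""
    index = defaultdict(set)
    for word in words:
        word = word.lower()
        index[len(word)].add(word)
    return index


def matching_fraction(document_words, index):
    """Fraction of document words found in the dictionary by full word match.

    Only the bucket with the same word length is searched for each word.
    """
    if not document_words:
        return 0.0
    hits = 0
    for word in document_words:
        word = word.lower()
        if word in index.get(len(word), ()):
            hits += 1
    return hits / len(document_words)
```

Under this sketch, a document with a low matching fraction against the domain dictionary would be a candidate outlier.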
Abstract: In order to achieve better road utilization and traffic
efficiency, there is an urgent need for a travel information delivery
mechanism to assist the drivers in making better decisions in the
emerging intelligent transportation system applications. In this paper,
we propose a relayed multicast scheme under heterogeneous networks
for this purpose. In the proposed system, travel information consisting
of summarized traffic conditions, important events, real-time traffic
videos, and local information service contents is formed into layers
and multicast through an integration of WiMAX infrastructure and
Vehicular Ad hoc Networks (VANET). With the support of adaptive
modulation and coding in WiMAX, the radio resources can be
optimally allocated when performing multicast so as to dynamically
adjust the number of data layers received by the users. In addition to
multicast supported by WiMAX, a knowledge propagation and
information relay scheme by VANET is designed. The experimental
results validate the feasibility and effectiveness of the proposed
scheme.
Abstract: In this paper we present a novel approach to face image coding. The proposed method makes use of features of video encoders such as motion prediction. First, the encoder selects an appropriate prototype from the database and warps it according to the features of the face being encoded. The warped prototype is placed as the first frame, an I frame. The face being encoded is placed as the second frame, a P frame. Information about feature positions, color change, the selected prototype and the data flow of the P frame is sent to the decoder. The condition is that both the encoder and the decoder hold the same database of prototypes. We ran an experiment with an H.264 video encoder and compared the results with those achieved by JPEG and JPEG2000. The results show that our approach is able to achieve a three times lower bitrate and a two times higher PSNR than JPEG. In comparison with JPEG2000, the bitrate was very similar, but the subjective quality achieved by the proposed method is better.
Abstract: In this paper we present a comparison of four content-based objective metrics with the results of subjective tests on 80 video sequences. We also include two objective metrics, VQM and SSIM, in our comparison to serve as “reference” objective metrics because their pros and cons have already been published. Each video sequence was preprocessed by the region recognition algorithm, and then the particular objective video quality metrics were calculated, i.e. mutual information, angular distance, moment of angle and normalized cross-correlation measure. The Pearson coefficient was calculated to express each metric's relationship to the accuracy of the model, and the Spearman rank order correlation coefficient to represent each metric's relationship to monotonicity. The results show that the model with mutual information as the objective metric provides the best result and is suitable for evaluating the quality of video sequences.
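The two correlation measures can be sketched with their standard definitions. This is a generic illustration, assuming paired lists of objective scores and subjective scores; the function names are illustrative, and ties are handled with average ranks.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

A monotonic but nonlinear relationship gives a Spearman coefficient of 1 while the Pearson coefficient stays below 1, which is why the two are reported separately.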
Abstract: Current technological advances pale in comparison to the changes in social behaviors and 'sense of place' that have been empowered since the Internet arrived on the scene. Today's students view the Internet as both a source of entertainment and an educational tool. The development of virtual environments is a conceptual framework that needs to be addressed by educators, and it is important that they become familiar with who these virtual learners are and how they are motivated to learn. Massively multiplayer online role playing games (MMORPGs), if well designed, could become the vehicle of choice to deliver learning content. We suggest that these games, in order to accomplish these goals, must begin with well-established instructional design principles that are co-aligned with established principles of video game design, and that they have the opportunity to provide an instructional model of significant prescriptive power. The authors believe that game designers need to take advantage of the natural motivation player-learners have for playing games by developing them in such a way as to promote intrinsic motivation, content learning, transfer of knowledge, and naturalization.
Abstract: The full search block matching algorithm is widely used for hardware implementation of motion estimators in video compression algorithms. In this paper we propose a new architecture, which consists of a 2D parallel processing unit and a 1D unit working in parallel. The proposed architecture reduces both data access power and computational power, which are the main causes of power consumption in integer motion estimation. It also completes the operations in nearly the same number of clock cycles as a 2D systolic array architecture. In this work the sum of absolute differences (SAD), the most repeated operation in block matching, is calculated in two steps. The first step is to calculate the SAD for alternate rows with the 2D parallel unit. If the SAD calculated by the parallel unit is less than the stored minimum SAD, the SAD of the remaining rows is calculated by the 1D unit. Early termination, which stops avoidable computations, has been achieved with the help of the alternate rows method proposed in this paper and by finding a low initial SAD value based on motion vector prediction. Data reuse has been applied to the reference blocks in the same search area, which significantly reduces memory access.
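The two-step SAD with early termination can be sketched in software as follows. This is only a sequential illustration of the control flow described in the abstract; the hardware units, data reuse and motion vector prediction are not modeled, and the function names are illustrative.

```python
def row_sad(current_row, candidate_row):
    """Sum of absolute differences for one row of pixels."""
    return sum(abs(c - r) for c, r in zip(current_row, candidate_row))


def two_step_sad(current, candidate, min_sad):
    """Return the full SAD, or None if the candidate is rejected early.

    Step 1 covers the alternate rows (0, 2, 4, ...). Step 2 covers the
    remaining rows and runs only if the partial SAD is below min_sad.
    """
    partial = sum(
        row_sad(current[i], candidate[i]) for i in range(0, len(current), 2)
    )
    if partial >= min_sad:
        return None  # early termination: this candidate cannot win
    rest = sum(
        row_sad(current[i], candidate[i]) for i in range(1, len(current), 2)
    )
    return partial + rest


def full_search(current, candidates, initial_min_sad=float("inf")):
    """Return (best_candidate_index, minimum_sad) over all candidate blocks."""
    best_index, min_sad = None, initial_min_sad
    for index, candidate in enumerate(candidates):
        sad = two_step_sad(current, candidate, min_sad)
        if sad is not None and sad < min_sad:
            best_index, min_sad = index, sad
    return best_index, min_sad
```

A low `initial_min_sad`, for instance one obtained from a predicted motion vector, lets more candidates be rejected after the first step.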
Abstract: In this paper we present a system for classifying videos
by frequency spectra. Many videos contain activities with repeating
movements. Sports videos, home improvement videos, or videos
showing mechanical motion are some example areas. Motion in these
areas usually repeats with a certain main frequency and several side
frequencies. Transforming repeating motion into the frequency domain
via FFT reveals these frequencies. Average amplitudes of frequency
intervals can be seen as features of cyclic motion. Hence determining
these features can help to classify videos with repeating movements.
In this paper we explain how to compute frequency spectra for video
clips and how to use them for classification. Our approach uses a series
of image moments as a function, which is in turn transformed
into its frequency domain.
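The feature computation can be sketched as follows, assuming one image moment value per frame (for example, a centroid coordinate) has already been extracted. A direct DFT is used here instead of an FFT to keep the sketch dependency-free; the function names and the equal-width bands are assumptions of this illustration.

```python
import cmath
import math


def dft_amplitudes(series):
    """One-sided DFT amplitudes of a per-frame moment series.

    The mean is removed first so that the constant offset does not dominate.
    """
    n = len(series)
    if n == 0:
        raise ValueError("series must not be empty")
    mean = sum(series) / n
    centered = [value - mean for value in series]
    amplitudes = []
    for k in range(n // 2 + 1):
        coefficient = sum(
            value * cmath.exp(-2j * math.pi * k * t / n)
            for t, value in enumerate(centered)
        )
        amplitudes.append(abs(coefficient) / n)
    return amplitudes


def band_features(amplitudes, num_bands):
    """Average amplitude in each of num_bands contiguous frequency intervals."""
    n = len(amplitudes)
    features = []
    for b in range(num_bands):
        chunk = amplitudes[b * n // num_bands:(b + 1) * n // num_bands]
        features.append(sum(chunk) / len(chunk) if chunk else 0.0)
    return features
```

For a motion that repeats four times within the clip, the amplitude peak appears at frequency index 4, and the band averages form the feature vector handed to a classifier.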