Abstract: Efficient storage, transmission and use of video information are key requirements in many multimedia applications currently being addressed by MPEG-4. To fulfill these requirements, a new approach for representing video information which relies on an object-based representation, has been adopted. Therefore, objectbased watermarking schemes are needed for copyright protection. This paper proposes a novel blind object watermarking scheme for images and video using the in place lifting shape adaptive-discrete wavelet transform (SA-DWT). In order to make the watermark robust and transparent, the watermark is embedded in the average of wavelet blocks using the visual model based on the human visual system. Wavelet coefficients n least significant bits (LSBs) are adjusted in concert with the average. Simulation results shows that the proposed watermarking scheme is perceptually invisible and robust against many attacks such as lossy image/video compression (e.g. JPEG, JPEG2000 and MPEG-4), scaling, adding noise, filtering, etc.
Abstract: The use of human hand as a natural interface for humancomputer interaction (HCI) serves as the motivation for research in hand gesture recognition. Vision-based hand gesture recognition involves visual analysis of hand shape, position and/or movement. In this paper, we use the concept of object-based video abstraction for segmenting the frames into video object planes (VOPs), as used in MPEG-4, with each VOP corresponding to one semantically meaningful hand position. Next, the key VOPs are selected on the basis of the amount of change in hand shape – for a given key frame in the sequence the next key frame is the one in which the hand changes its shape significantly. Thus, an entire video clip is transformed into a small number of representative frames that are sufficient to represent a gesture sequence. Subsequently, we model a particular gesture as a sequence of key frames each bearing information about its duration. These constitute a finite state machine. For recognition, the states of the incoming gesture sequence are matched with the states of all different FSMs contained in the database of gesture vocabulary. The core idea of our proposed representation is that redundant frames of the gesture video sequence bear only the temporal information of a gesture and hence discarded for computational efficiency. Experimental results obtained demonstrate the effectiveness of our proposed scheme for key frame extraction, subsequent gesture summarization and finally gesture recognition.
Abstract: This paper will present the initial findings of a
research into distributed computer rendering. The goal of the
research is to create a distributed computer system capable of
rendering a 3D model into an MPEG-4 stream. This paper outlines
the initial design, software architecture and hardware setup for the
system.
Distributed computing means designing and implementing
programs that run on two or more interconnected computing systems.
Distributed computing is often used to speed up the rendering of
graphical imaging. Distributed computing systems are used to
generate images for movies, games and simulations.
A topic of interest is the application of distributed computing to
the MPEG-4 standard. During the course of the research, a
distributed system will be created that can render a 3D model into an
MPEG-4 stream. It is expected that applying distributed computing
principals will speed up rendering, thus improving the usefulness and
efficiency of the MPEG-4 standard
Abstract: A talking head system (THS) is presented to animate
the face of a speaking 3D avatar in such a way that it realistically
pronounces the given Korean text. The proposed system consists of
SAPI compliant text-to-speech (TTS) engine and MPEG-4 compliant
face animation generator. The input to the THS is a unicode text that is
to be spoken with synchronized lip shape. The TTS engine generates a
phoneme sequence with their duration and audio data. The TTS
applies the coarticulation rules to the phoneme sequence and sends a
mouth animation sequence to the face modeler. The proposed THS can
make more natural lip sync and facial expression by using the face
animation generator than those using the conventional visemes only.
The experimental results show that our system has great potential for
the implementation of talking head for Korean text.