Abstract: A robust still-image face localization algorithm
capable of operating in an unconstrained visual environment is
proposed. First, construction of a robust skin classifier within a
shifted HSV color space is described. Then various filtering
operations are performed to better isolate face candidates and
mitigate the effect of substantial non-skin regions. Finally, a novel
Bhattacharyya-based face detection algorithm is used to compare
candidate regions of interest with a unique illumination-dependent
face model probability distribution function approximation.
Experimental results show a 90% face detection success rate despite
the demands of the visually noisy environment.
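The Bhattacharyya-based comparison mentioned above can be illustrated with a minimal sketch. It assumes that both the candidate region and the face model are represented as histograms over the same bins; the abstract does not specify the representation, and the function names below are hypothetical rather than taken from the paper.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative counts so that it sums to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Return the Bhattacharyya coefficient of two histograms.

    The result lies in [0, 1]: 1 means identical distributions,
    0 means the distributions do not overlap at all.
    """
    if len(p) != len(q):
        raise ValueError("histograms must use the same bins")
    p, q = normalize(p), normalize(q)
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return min(bc, 1.0)  # guard against floating-point overshoot


def bhattacharyya_distance(p, q):
    """Return -ln(BC); infinite when the histograms do not overlap."""
    bc = bhattacharyya_coefficient(p, q)
    return math.inf if bc == 0.0 else -math.log(bc)


# Hypothetical data: a face-model histogram and one candidate region.
face_model = [2, 10, 25, 10, 3]
candidate = [3, 9, 22, 12, 4]
score = bhattacharyya_coefficient(candidate, face_model)
```

Under this assumption, a candidate would be accepted as a face when its coefficient exceeds a chosen threshold; the threshold and the illumination-dependent choice of model are left to the paper.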
Abstract: The practical implementation of audio-video coupled speech recognition systems is mainly limited by the hardware complexity of integrating two radically different information-capturing devices with good temporal synchronisation. In this paper, we propose a solution based on a smart CMOS image sensor that simplifies this hardware integration. Using on-chip image processing, the smart sensor calculates the X/Y projections of the captured image in real time. This on-chip projection considerably reduces the volume of the output data, which allows the condensed visual information to be transmitted over the same audio channel through the stereophonic input available on most standard computing devices such as PCs, PDAs and mobile phones. A prototype called VMIKE (Visio-Microphone) has been designed and realised in a standard 0.35 µm CMOS technology. A preliminary experiment gives encouraging results. Its efficiency will be further investigated in a wide variety of applications such as biometrics, speech recognition in noisy environments, and voice control for military users or persons with disabilities.
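To make the data-volume reduction concrete, the X/Y projections described above can be sketched in software. This is only an illustration of the arithmetic, not the on-chip circuit: it assumes the X projection is the per-column sum of pixel intensities and the Y projection is the per-row sum, and the function name is hypothetical.

```python
def xy_projections(image):
    """Return (x_projection, y_projection) of a 2-D grayscale image.

    `image` is a list of equal-length rows of pixel intensities.
    Assumption: the X projection has one value per column (column sums)
    and the Y projection has one value per row (row sums).
    """
    if not image or not image[0]:
        raise ValueError("image must be non-empty")
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows must have the same length")
    x_projection = [sum(column) for column in zip(*image)]
    y_projection = [sum(row) for row in image]
    return x_projection, y_projection


# A tiny 2 x 3 image: 6 pixel values reduce to 3 + 2 = 5 projection values.
x_proj, y_proj = xy_projections([[1, 2, 3],
                                 [4, 5, 6]])
```

For an image of H rows and W columns, the output shrinks from H × W pixel values to H + W projection values, which is the kind of reduction that makes transmission over an audio-rate channel plausible.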
Abstract: Detection and tracking of the lip contour is an important
issue in speechreading. While there are solutions for lip tracking
once a good contour initialization in the first frame is available,
the problem of finding such a good initialization is not yet solved
automatically, but done manually. We have developed a new tracking
solution for lip contour detection using only a few landmarks (15
to 25) and applying the well-known Active Shape Models (ASM).
The proposed method is a new LMS-like adaptive scheme based on
an autoregressive (AR) model that has been fitted to the landmark
variations in successive video frames. Moreover, we propose an extra
motion compensation model to address more general cases in lip
tracking. Computer simulations demonstrate a fair match between
the true and the estimated pixel positions. Significant improvements
over the well-known LMS approach have been obtained, as measured by a
defined Frobenius norm index.
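For orientation, the baseline that the abstract compares against can be sketched as a standard LMS-adapted AR predictor for a single landmark coordinate. This is the textbook LMS update, not the new scheme proposed in the paper; the function name, the model order and the step size are assumptions chosen for illustration.

```python
def lms_ar_predict(samples, order=2, step=0.01):
    """Predict each sample from the previous `order` samples with LMS.

    Returns (predictions, weights). predictions[i] is the estimate of
    samples[order + i] made before that sample was observed.
    """
    if order < 1 or len(samples) <= order:
        raise ValueError("need more samples than the model order")
    weights = [0.0] * order
    predictions = []
    for n in range(order, len(samples)):
        past = samples[n - order:n][::-1]  # most recent sample first
        estimate = sum(w * x for w, x in zip(weights, past))
        error = samples[n] - estimate
        # LMS update: move each weight along the error-weighted input.
        weights = [w + step * error * x for w, x in zip(weights, past)]
        predictions.append(estimate)
    return predictions, weights


# Hypothetical track: one landmark coordinate that stays constant.
track = [1.0] * 50
predicted, learned = lms_ar_predict(track, order=1, step=0.5)
```

In a tracker built on this idea, one such predictor would run per landmark coordinate, and the prediction error across all landmarks could be summarised with a matrix norm such as the Frobenius norm mentioned above.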