Abstract: In this paper, a new approach for target recognition based on the empirical mode decomposition (EMD) algorithm of Huang et al. [11] and the energy tracking operator of Teager [13]-[14] is introduced. The conjunction of these two methods is called Teager-Huang analysis. This approach is well suited to the analysis of nonstationary signals. The impulse response (IR) of a target is first band-pass filtered into subsignals (components) called intrinsic mode functions (IMFs) with well-defined instantaneous frequency (IF) and instantaneous amplitude (IA). Each IMF is a zero-mean AM-FM component. In the second step, the energy of each IMF is tracked using the Teager energy operator (TEO). The IF and IA, which describe the time-varying characteristics of the signal, are estimated using the energy separation algorithm (ESA) of Maragos et al. [16]-[17]. In the third step, a set of features such as skewness and kurtosis is extracted from the IF, IA and IMF energy functions. The Teager-Huang analysis is tested on a set of synthetic IRs of sonar targets with different physical characteristics (density, velocity, shape, ...). PCA is first applied to the features to discriminate between manufactured and natural targets. The manufactured patterns are then classified into spheres and cylinders. One hundred percent correct recognition is achieved with twenty-three echoes, where the sixteen IRs used for training are noise-free and the seven IRs used for testing are corrupted with white Gaussian noise.
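The energy tracking and energy separation steps described in this abstract can be illustrated with a minimal sketch. It assumes the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and the DESA-2 variant of the energy separation algorithm; the abstract does not say which ESA variant is used, so DESA-2 and the function names teager_energy and desa2 are assumptions made for illustration only. The EMD step that produces the IMFs is not shown.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1], interior samples only."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def desa2(x):
    """Estimate instantaneous frequency (rad/sample) and amplitude with DESA-2.

    Assumed variant: omega ~ 0.5 * acos(1 - psi[d] / (2 * psi[x])),
                     |a|   ~ 2 * psi[x] / sqrt(psi[d]),
    where d[n] = x[n+1] - x[n-1] is a symmetric difference.
    """
    psi_x = teager_energy(x)                                    # psi_x[i] <-> sample i + 1
    diff = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]  # diff[i]  <-> sample i + 1
    psi_d = teager_energy(diff)                                 # psi_d[j] <-> sample j + 2
    freqs, amps = [], []
    for j, pd in enumerate(psi_d):
        px = psi_x[j + 1]                                       # align both at sample j + 2
        if px <= 0.0 or pd <= 0.0:
            continue                                            # estimate undefined here
        c = max(-1.0, min(1.0, 1.0 - pd / (2.0 * px)))          # clamp against round-off
        freqs.append(0.5 * math.acos(c))
        amps.append(2.0 * px / math.sqrt(pd))
    return freqs, amps

# Usage on a pure tone standing in for one IMF (amplitude 2, 0.3 rad/sample).
tone = [2.0 * math.cos(0.3 * n + 0.5) for n in range(64)]
inst_freq, inst_amp = desa2(tone)
```

For a pure sinusoid the operator returns the constant A^2 sin^2(omega), so both estimates are exact up to round-off; on real IMFs they are only approximations that hold when amplitude and frequency vary slowly.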
Abstract: In this work, we develop a speech denoising tool based on the discrete wavelet packet transform (DWPT). The tool is intended for recognition, coding and synthesis applications. For noise reduction, instead of applying the classical thresholding technique, some wavelet packet nodes are set to zero and the others are thresholded. To estimate the nonstationary noise level, we employ the spectral entropy. We evaluate our approach by comparing it with classical denoising methods based on thresholding and spectral subtraction. The experiments use speech signals corrupted by two kinds of noise, white noise and Volvo noise. Listening tests show that the proposed technique performs better than spectral subtraction. SNR computations show that it outperforms the classical thresholding method using the modified hard thresholding function based on the μ-law algorithm.
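The idea of zeroing some wavelet packet nodes and thresholding the rest can be sketched as follows. This is only an illustration under stated assumptions: the node names, the set of zeroed nodes and the threshold value are hypothetical, the classical hard and soft rules stand in for the paper's modified μ-law rule (which the abstract does not define), and the DWPT itself and the entropy-based noise estimate are not shown.

```python
def hard_threshold(coeffs, t):
    """Classical hard rule: keep coefficients whose magnitude exceeds t, zero the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]

def soft_threshold(coeffs, t):
    """Classical soft rule: shrink surviving coefficients toward zero by t."""
    out = []
    for c in coeffs:
        if c > t:
            out.append(c - t)
        elif c < -t:
            out.append(c + t)
        else:
            out.append(0.0)
    return out

def denoise_nodes(nodes, zeroed, t, rule=hard_threshold):
    """Zero the nodes listed in `zeroed`; apply `rule` with threshold t to the others.

    `nodes` maps a (hypothetical) node label to its coefficient list.
    """
    return {
        name: [0.0] * len(coeffs) if name in zeroed else rule(coeffs, t)
        for name, coeffs in nodes.items()
    }

# Usage with made-up coefficients for three nodes at depth 2.
nodes = {"aa": [3.0, -2.5, 0.4], "ad": [0.2, -0.1, 0.3], "dd": [0.9, -1.2, 0.1]}
cleaned = denoise_nodes(nodes, zeroed={"ad"}, t=0.5)
```

Passing rule=soft_threshold switches to the soft rule without changing the node selection logic.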
Abstract: This paper proposes a new facial feature extraction approach based on the Walsh-Hadamard Transform (WHT). The approach exploits the correlation between local pixels of the face image. Its primary advantage is the simplicity of its computation. The paper compares the WHT, which has traditionally been used in data compression, with two other known approaches, Principal Component Analysis (PCA) and the Discrete Cosine Transform (DCT), using the face database of the Olivetti Research Laboratory (ORL). In spite of its simple computation, the proposed algorithm gave results very close to those obtained by PCA and DCT. This paper initiates research into the WHT and the family of frequency transforms and examines their suitability for feature extraction in face recognition applications.
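The computational simplicity claimed for the WHT comes from the fact that it needs only additions and subtractions. A minimal sketch of the fast Walsh-Hadamard transform (unnormalized, natural Hadamard ordering) is given below; how the paper maps image blocks to feature vectors is not stated in the abstract, so the pixel row used here is purely illustrative.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform, unnormalized, natural (Hadamard) order.

    The input length must be a power of two. Uses only additions and subtractions.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                u, v = a[i], a[i + h]
                a[i], a[i + h] = u + v, u - v   # butterfly: sum and difference
        h *= 2
    return a

# Usage on one hypothetical row of eight binary pixel values.
row = [1, 0, 1, 0, 0, 1, 1, 0]
coeffs = fwht(row)   # [4, 2, 0, -2, 0, 2, 0, 2]
```

Applying the transform twice returns the input scaled by its length, so the inverse is the same routine followed by a division by n.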
Abstract: Intelligent Video-Surveillance (IVS) systems are
becoming increasingly popular in security applications. The analysis
and recognition of abnormal behaviours in a video sequence has
gradually drawn attention in the field of IVS, since it allows a
large amount of useless information to be filtered out, which ensures
high efficiency in security protection and saves substantial human
and material resources. We present in this paper ADABeV, an
intelligent video-surveillance framework for event recognition in
crowded scenes that detects abnormal human behaviour. The
framework is intended to achieve real-time alarming,
reducing the lags of traditional monitoring systems. The proposed
architecture addresses four main challenges: behaviour understanding in
crowded scenes, hard lighting conditions, multiple kinds of input
sensors and context-based adaptability to recognize the active
context of the scene.
Abstract: Different methods involving biometric algorithms are
presented for eigenface-based detection, covering face recognition,
identification and verification. The aim of this research is to manage
the critical processing concerns (accuracy, speed, security and
monitoring) of face processing, with the flexibility to search and
edit a secure, authorized database. In this paper we implement
different techniques, such as eigenface vector reduction using
texture and shape vectors to reduce complexity, while density
matching scores with Face Boundary Fixation (FBF) extract the most
likely characteristics of the media being processed. We examine the
development and performance efficiency of the database by applying
our algorithms to both recognition and detection. Our results show
gains in accuracy and security over a number of previous approaches
in all of the above processes.
Abstract: Instead of representing individual cognition only, population cognition is represented using artificial neural networks whilst maintaining individuality. This population network trains continuously, simulating adaptation. An implementation of two coexisting populations is compared to the Lotka-Volterra model of predator-prey interaction. Applications include multi-agent systems such as artificial life or computer games.
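For reference, the Lotka-Volterra predator-prey model used as the comparison baseline is dx/dt = alpha*x - beta*x*y for the prey and dy/dt = delta*x*y - gamma*y for the predators. The sketch below integrates it with a classical fourth-order Runge-Kutta step. The parameter values, initial populations and step size are arbitrary assumptions chosen only for illustration; the neural population networks of the paper are not modelled here.

```python
import math

def lotka_volterra(x, y, alpha, beta, delta, gamma):
    """Right-hand side: x is the prey population, y the predator population."""
    return alpha * x - beta * x * y, delta * x * y - gamma * y

def rk4_step(x, y, dt, params):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = lotka_volterra(x, y, *params)
    k2x, k2y = lotka_volterra(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, *params)
    k3x, k3y = lotka_volterra(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, *params)
    k4x, k4y = lotka_volterra(x + dt * k3x, y + dt * k3y, *params)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def simulate(x0, y0, params, dt=0.01, steps=2000):
    """Return the trajectory [(x, y), ...] including the initial state."""
    trajectory = [(x0, y0)]
    x, y = x0, y0
    for _ in range(steps):
        x, y = rk4_step(x, y, dt, params)
        trajectory.append((x, y))
    return trajectory

def invariant(x, y, alpha, beta, delta, gamma):
    """Quantity conserved along exact solutions; useful as a numerical sanity check."""
    return delta * x - gamma * math.log(x) + beta * y - alpha * math.log(y)

# Usage with assumed parameters (alpha, beta, delta, gamma).
params = (1.0, 0.1, 0.075, 1.5)
trajectory = simulate(10.0, 5.0, params)
```

Because the exact solutions are closed orbits, the two populations oscillate out of phase; a simulated pair of populations can be compared against such a trajectory.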
Abstract: Pattern recognition is the research area of Artificial Intelligence that studies the operation and design of systems that recognize patterns in data. Important application areas are image analysis, character recognition, fingerprint classification, speech analysis, DNA sequence identification, man and machine diagnostics, person identification and industrial inspection. The interest in improving classification systems for data analysis is independent of the application context. In fact, many studies need to recognize and distinguish groups of various objects, which calls for valid instruments capable of performing this task. The objective of this article is to show several Artificial Intelligence methodologies for data classification applied to biomedical patterns. In particular, this work deals with the realization of a Computer-Aided Detection (CADe) system that is able to assist the radiologist in identifying types of mammary tumor lesions. As an additional biomedical application of classification systems, we present a study conducted on blood samples which shows how these methods may help to distinguish between carriers of Thalassemia (or Mediterranean Anaemia) and healthy subjects.
Abstract: The paper discusses the mathematics of pattern
indexing and its applications to the recognition of visual patterns
found in video clips. It is shown that (a) pattern indexes can be
represented by collections of inverted patterns, and (b) solutions to
pattern classification problems can be found as intersections and
histograms of inverted patterns, so that matching of the original
patterns is avoided.
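A minimal sketch of this idea, under assumptions: each stored pattern is taken to be a set of discrete feature codes (a hypothetical encoding; the abstract does not define one), and the "inverted pattern" of a feature is the set of pattern labels that contain it. Classification then works on the inverted patterns alone, by intersection for exact containment or by a vote histogram for the best partial match.

```python
from collections import Counter, defaultdict

def build_inverted(patterns):
    """Map each feature code to the set of pattern labels that contain it."""
    inverted = defaultdict(set)
    for label, features in patterns.items():
        for feature in features:
            inverted[feature].add(label)
    return inverted

def match_by_intersection(inverted, query):
    """Labels of stored patterns that contain every feature of the query."""
    sets = [inverted.get(feature, set()) for feature in query]
    return set.intersection(*sets) if sets else set()

def vote_histogram(inverted, query):
    """Count, per label, how many query features it shares (tolerates missing features)."""
    histogram = Counter()
    for feature in query:
        histogram.update(inverted.get(feature, ()))
    return histogram

# Usage with three made-up patterns.
patterns = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {5}}
index = build_inverted(patterns)
exact = match_by_intersection(index, {2, 3})    # patterns containing both 2 and 3
votes = vote_histogram(index, {1, 2, 9})        # feature 9 is unknown and ignored
```

Neither lookup compares the query against a stored pattern directly; the cost depends on the number of query features and the sizes of their inverted patterns.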
Abstract: Doping is an intricate problem in sport today. Wrestling
is a nationally popular sport in Iran, and because of its
power-demanding nature the prevalence of doping may be high. We
therefore aimed to assess knowledge of and attitudes toward doping among
club wrestlers. In a cross-sectional study, 426 wrestlers were
studied using a researcher-made questionnaire. The clubs were
selected by randomized cluster sampling, and the questionnaire was
distributed among their wrestlers.
Knowledge of wrestlers in three categories, doping definitions,
recognition of prohibited drugs and side effects, was poor or moderate
in 70.8%, 95.8% and 99.5%, respectively. Wrestlers thus have poor
knowledge of doping. Furthermore, they believe some unfavorable
myths. It seems necessary to design a comprehensive
educational program for all athletes and coaches.
Abstract: One of the main image representations in Mathematical Morphology is the 3D Shape Decomposition Representation, useful for Image Compression and Representation, and Pattern Recognition. The 3D Morphological Shape Decomposition representation can be generalized a number of times, to extend the scope of its algebraic characteristics as much as possible. With these generalizations, the role of the Morphological Shape Decomposition as an efficient image decomposition tool is extended to grayscale images. This work follows the above line and develops it further. A new evolutionary branch is added to the development of the 3D Morphological Shape Decomposition by introducing a 3D Multi Structuring Element Morphological Shape Decomposition, which permits 3D Morphological Shape Decomposition of 3D binary images (grayscale images) into "multiparameter" families of elements. Originally, 3D Morphological Shape Decomposition representations were based only on "1 parameter" families of elements for image decomposition. This paper addresses grayscale interframe interpolation by means of mathematical morphology. The new interframe interpolation method is based on the generalized morphological 3D Shape Decomposition. This article presents the theoretical background of morphological interframe interpolation, derives the new representation and shows some application examples. Computer simulations illustrate the results.
Abstract: The paper presents an on-line recognition machine
(RM) for continuous/isolated, dynamic and static gestures that arise
in Flight Deck Officer (FDO) training. RM is based on a generic pattern
recognition framework. Gestures are represented as templates using
summary statistics. The proposed recognition algorithm exploits temporal
and spatial characteristics of gestures via dynamic programming
and a Markovian process. The algorithm predicts the corresponding index
of incremental input data in the templates in an on-line mode.
Accumulated consistency in the sequence of prediction provides a
similarity measurement (Score) between input data and the templates.
The algorithm provides an intuitive mechanism for automatic detection
of start/end frames of continuous gestures. In the present paper,
we consider isolated gestures. The performance of RM is evaluated
using four datasets: artificial (W TTest), hand motion (Yang) and
FDO (tracker, vision-based). RM achieves results comparable to
those of other on-line and off-line algorithms such as the
hidden Markov model (HMM) and dynamic time warping (DTW).
The proposed algorithm has the additional advantage of providing
timely feedback for training purposes.
Abstract: In this paper, we introduce a skin detection
method using a histogram approximation based on the mean shift
algorithm. The proposed method applies the mean shift procedure to a
histogram of a skin map of the input image, generated by comparison
with standard skin colors in the CbCr color space, and divides the
background from the skin region by selecting the maximum value
according to brightness level. The proposed method detects the skin
region using the mean shift procedure to determine a maximum value
that becomes the dividing point, rather than using a manually selected
threshold value, as in existing techniques. Even when skin color is
contaminated by illumination, the procedure can accurately segment
the skin region and the background region. The proposed method may
be useful in detecting facial regions as a pretreatment for face
recognition in various types of illumination.
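The mean shift step on a histogram can be illustrated with a one-dimensional sketch: starting from some bin, the position is repeatedly moved to the count-weighted mean of the bins inside a window until it settles on a local maximum. The flat kernel, the bandwidth, the stopping tolerance and the example histogram are all assumptions made for illustration; the construction of the CbCr skin map and the exact way the paper uses the located maximum as a dividing point are not reproduced here.

```python
import math

def mean_shift_mode(hist, start, bandwidth=3, tol=1e-3, max_iter=100):
    """Climb to a local mode of a 1-D histogram with a flat-kernel mean shift.

    hist      : list of non-negative bin counts
    start     : initial bin position
    bandwidth : half-width of the window, in bins (assumed value)
    Returns the converged (possibly fractional) bin position.
    """
    pos = float(start)
    for _ in range(max_iter):
        lo = max(0, int(math.ceil(pos - bandwidth)))
        hi = min(len(hist) - 1, int(math.floor(pos + bandwidth)))
        weight = sum(hist[i] for i in range(lo, hi + 1))
        if weight == 0:
            break                                   # empty window: nothing to climb
        new_pos = sum(i * hist[i] for i in range(lo, hi + 1)) / weight
        converged = abs(new_pos - pos) < tol
        pos = new_pos
        if converged:
            break
    return pos

# Usage on a made-up histogram with a single peak at bin 4.
hist = [0, 1, 2, 5, 9, 5, 2, 1, 0, 0, 0, 0]
mode_from_left = mean_shift_mode(hist, start=2)
mode_from_right = mean_shift_mode(hist, start=6)
```

Both starting points settle within half a bin of the peak, so rounding the result gives the peak bin; no manually chosen threshold is involved in locating it.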
Abstract: Film, as an art form, plays a vital role and is a powerful tool in documenting, influencing and shaping society. Films are the collective creation of a large number of separate individuals, each contributing creative input, unique talents, and technical expertise to the project. Recently, Malaysian independent (or "Indie") filmmakers have made their presence felt by winning awards at various international film festivals. Working in the digital video (DV) format, a number of independent filmmakers have hit their stride with a range of remarkably strong titles. International recognition has been quick in coming: their works are now regularly in exhibition or in competition, winning many top prizes at prestigious festivals around the world. The interaction factors among crew members are emphasized as imperative for group success. An in-depth interview is conducted to analyze the social interactions and exchanges between filmmakers through Social Exchange Theory (SET). The new millennium, marked by the digital technology revolution, has certainly changed the face of filmmaking in Malaysia. There is a clear need to study Malaysian independent cinema, especially to understand what enables independent filmmakers to work so well given all of the difficulties and constraints.
Abstract: Local Linear Neuro-Fuzzy Models (LLNFM), like other neuro-fuzzy systems, are adaptive networks that provide robust learning capabilities and are widely used in applications such as pattern recognition, system identification, image processing and prediction. The local linear model tree (LOLIMOT) is a Takagi-Sugeno-Kang neuro-fuzzy algorithm that has proven its efficiency, compared with other neuro-fuzzy networks, in learning nonlinear systems and in pattern recognition. In this paper, dedicated reconfigurable, parallel-processing hardware for the LOLIMOT algorithm and its applications are presented. This hardware realizes on-chip learning, which allows it to work as a standalone device in a system. Synthesis results on FPGA platforms show its potential to run at least 250 times faster than software implementations of the algorithm.
Abstract: Automatic reading of handwritten cheques is a computationally
complex process and plays an important role in financial
risk management. Machine vision and learning provide a viable
solution to this problem. Research effort has mostly been focused
on recognizing diverse pitches of cheques and demand drafts with an
identical outline. However, most of these methods employ template-matching
to localize the pitches, and such schemes could potentially
fail when applied to the different types of outline maintained by
banks. In this paper, the so-called outline problem is resolved by
a cheque information tree (CIT), which generalizes the localizing
method to extract active-region-of-entities. In addition, the weight
based density plot (WBDP) is performed to isolate text entities and
read complete pitches. Recognition is based on texture features using
neural classifiers. The legal amount is subsequently recognized by both
texture and perceptual features. A post-processing phase is invoked
to detect incorrect readings by a Type-2 grammar using a Turing
machine. The performance of the proposed system was evaluated
using cheques and demand drafts of 22 different banks. The test data
consist of a collection of 1540 leaves obtained from 10 different
account holders from each bank. Results show that this approach
can easily be deployed without significant design amendments.
Abstract: In this paper, a novel method for recognizing musical
instruments in polyphonic music is presented, using an
embedded hidden Markov model (EHMM). The EHMM is a doubly
embedded HMM structure in which each state of the external HMM
is an independent HMM. Classification is performed for
two different internal HMM structures, with GMMs used as
likelihood estimators for the internal HMMs. The results are compared
to those achieved by an artificial neural network with two
hidden layers. Appropriate classification accuracies were achieved
both for solo instrument performances and for instrument combinations,
which demonstrates that the new approach outperforms similar
classification methods by exploiting the dynamics of the signal.
Abstract: Advances in Artificial Intelligence have led to the
development of various "smart" devices. A character recognition
device is one such smart device: it acquires partial human
intelligence, with the ability to capture and recognize various
characters in different languages. First, multiscale neural training
with modifications to the input training vectors is adopted in this
paper for its advantage in training on higher-resolution character
images. Second, selective thresholding using a minimum distance
technique is proposed to increase the accuracy of
character recognition. A simulator program (a GUI) is designed so
that the characters can be located at any spot on the
blank paper on which they are written. The results show that
these methods, with a moderate number of training epochs, can produce
accuracies of at least 85% for handwritten upper-case
English characters and numerals.
Abstract: In digital signal processing it is important to
approximate multi-dimensional data by rank
reduction, in which the rank of the data is reduced from
higher to lower. For 2-dimensional data, singular value
decomposition (SVD) is one of the best-known rank reduction
techniques. In addition, an outer product expansion extended from SVD
was proposed and implemented for multi-dimensional data, and has
been widely applied to image processing and pattern recognition.
However, the multi-dimensional outer product expansion has
high computational complexity and lacks orthogonality between the
expansion terms. We have therefore proposed an alternative method,
the Third-order Orthogonal Tensor Product Expansion (3-OTPE).
3-OTPE uses the power method instead of a nonlinear optimization
method in order to decrease computing time. At the same time, the group
of B. D. Lathauwer proposed the Higher-Order SVD (HOSVD), which was
also developed as an extension of SVD for multi-dimensional data.
3-OTPE and HOSVD are similar in how they perform rank reduction of
multi-dimensional data. The results computed by the two methods
are the same in some cases and slightly different in
others. In this paper, we compare 3-OTPE with
HOSVD in terms of calculation accuracy and computing time,
and clarify the differences between the two methods.
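To make the role of the power method concrete, the sketch below computes a rank-1 approximation sigma * (u x v x w) of a third-order tensor by alternating power iterations (often called the higher-order power method). It is a generic illustration and not the 3-OTPE algorithm itself, whose orthogonality constraints and handling of further expansion terms are not described in the abstract; the function names, the all-ones starting vectors and the iteration count are assumptions.

```python
import math

def normalize(vec):
    """Return (unit vector, original Euclidean norm); a zero vector is returned unchanged."""
    norm = math.sqrt(sum(c * c for c in vec))
    if norm == 0.0:
        return list(vec), 0.0
    return [c / norm for c in vec], norm

def rank1_power_method(T, iters=50):
    """Rank-1 approximation of a 3-way tensor T[i][j][k] by alternating power iterations."""
    I, J, K = len(T), len(T[0]), len(T[0][0])
    u, _ = normalize([1.0] * I)
    v, _ = normalize([1.0] * J)
    w, _ = normalize([1.0] * K)
    sigma = 0.0
    for _ in range(iters):
        # Contract T with two of the vectors, then renormalize the third.
        u, _ = normalize([sum(T[i][j][k] * v[j] * w[k] for j in range(J) for k in range(K))
                          for i in range(I)])
        v, _ = normalize([sum(T[i][j][k] * u[i] * w[k] for i in range(I) for k in range(K))
                          for j in range(J)])
        w, sigma = normalize([sum(T[i][j][k] * u[i] * v[j] for i in range(I) for j in range(J))
                              for k in range(K)])
    return sigma, u, v, w

# Usage on an exactly rank-1 tensor built from three made-up vectors.
a, b, c = [1.0, 2.0], [2.0, 0.0, 1.0], [1.0, 1.0]
T = [[[ai * bj * ck for ck in c] for bj in b] for ai in a]
sigma, u, v, w = rank1_power_method(T)
```

For this exactly rank-1 input, sigma equals the product of the three vector norms and u, v, w are the normalized factors; for general tensors the iteration converges only to a local optimum that depends on the starting vectors.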
Abstract: One major source of performance decline in speaker
recognition systems is channel mismatch between training and testing.
This paper focuses on improving the channel robustness of a speaker
recognition system in two respects: channel compensation
and channel-robust features. The system is a text-independent speaker
identification system based on two-stage recognition. For
channel compensation, this paper applies the MAP (Maximum
A Posteriori) channel compensation technique, which was previously
used in speech recognition, to the speaker recognition system. For
channel-robust features, this paper introduces
pitch-dependent features and a pitch-dependent speaker model for the
second-stage recognition. After the first-stage recognition of the
test speech using a GMM (Gaussian Mixture Model), the system
uses the GMM scores to decide whether the speech needs to be recognized again. If it
does, the system selects a few speakers from all of the speakers
who participated in the first-stage recognition for the second-stage
recognition. For each selected speaker, the system obtains 3
pitch-dependent results from that speaker's pitch-dependent speaker model, and
then uses an ANN (Artificial Neural Network) to combine the 3
pitch-dependent results and 1 GMM score into a fused result.
The system performs the second-stage recognition based on these fused
results. The experiments show that the correct rate of the two-stage
recognition system based on the MAP channel compensation technique
and pitch-dependent features is 41.7% better than that of the baseline system
for the closed-set test.
Abstract: Emotion in speech is an issue that has been attracting
the interest of the speech community for many years, both in the
context of speech synthesis and in automatic speech
recognition (ASR). In spite of the remarkable recent progress in
Large Vocabulary Recognition (LVR), it is still far from the
ultimate goal of recognising free conversational speech uttered by
any speaker in any environment. Current experimental tests show
that the error rate of state-of-the-art large vocabulary recognition
systems increases substantially when they are applied to
spontaneous/emotional speech. This paper shows that the recognition
rate for emotionally coloured speech can be improved by using a
language model based on increased representation of emotional
utterances.