Abstract: In this paper, a novel system
recognition of human faces without using face
different color photographs is proposed. It mainly in
face detection, normalization and recognition. Foot
method of combination of Haar-like face determined
segmentation and region-based histogram stretchi
(RHST) is proposed to achieve more accurate perf
using Haar. Apart from an effective angle norm
side-face (pose) normalization, which is almost a might be important and beneficial for the prepr
introduced. Then histogram-based and photom
normalization methods are investigated and ada
retinex (ASR) is selected for its satisfactory illumin
Finally, weighted multi-block local binary pattern
with 3 distance measures is applied for pair-mat
Experimental results show its advantageous perfo
with PCA and multi-block LBP, based on a principle.
Abstract: A complex valued neural network is a neural network, which consists of complex valued input and/or weights and/or thresholds and/or activation functions. Complex-valued neural networks have been widening the scope of applications not only in electronics and informatics, but also in social systems. One of the most important applications of the complex valued neural network is in image and vision processing. In Neural networks, radial basis functions are often used for interpolation in multidimensional space. A Radial Basis function is a function, which has built into it a distance criterion with respect to a centre. Radial basis functions have often been applied in the area of neural networks where they may be used as a replacement for the sigmoid hidden layer transfer characteristic in multi-layer perceptron. This paper aims to present exhaustive results of using RBF units in a complex-valued neural network model that uses the back-propagation algorithm (called 'Complex-BP') for learning. Our experiments results demonstrate the effectiveness of a Radial basis function in a complex valued neural network in image recognition over a real valued neural network. We have studied and stated various observations like effect of learning rates, ranges of the initial weights randomly selected, error functions used and number of iterations for the convergence of error on a neural network model with RBF units. Some inherent properties of this complex back propagation algorithm are also studied and discussed.
Abstract: Cluster analysis is the name given to a diverse collection of techniques that can be used to classify objects (e.g. individuals, quadrats, species etc). While Kohonen's Self-Organizing Feature Map (SOFM) or Self-Organizing Map (SOM) networks have been successfully applied as a classification tool to various problem domains, including speech recognition, image data compression, image or character recognition, robot control and medical diagnosis, its potential as a robust substitute for clustering analysis remains relatively unresearched. SOM networks combine competitive learning with dimensionality reduction by smoothing the clusters with respect to an a priori grid and provide a powerful tool for data visualization. In this paper, SOM is used for creating a toroidal mapping of two-dimensional lattice to perform cluster analysis on results of a chemical analysis of wines produced in the same region in Italy but derived from three different cultivators, referred to as the “wine recognition data" located in the University of California-Irvine database. The results are encouraging and it is believed that SOM would make an appealing and powerful decision-support system tool for clustering tasks and for data visualization.
Abstract: To explore pipelines is one of various bio-mimetic
robot applications. The robot may work in common buildings such as
between ceilings and ducts, in addition to complicated and massive
pipeline systems of large industrial plants. The bio-mimetic robot finds
any troubled area or malfunction and then reports its data. Importantly,
it can not only prepare for but also react to any abnormal routes in the
pipeline. The pipeline monitoring tasks require special types of mobile
robots. For an effective movement along a pipeline, the movement of
the robot will be similar to that of insects or crawling animals. During
its movement along the pipelines, a pipeline monitoring robot has an
important task of finding the shapes of the approaching path on the
pipes. In this paper we propose an effective solution to the pipeline
pattern recognition, based on the fuzzy classification rules for the
measured IR distance data.
Abstract: In this paper we present an enhanced noise reduction method for robust speech recognition using Adaptive Gain Equalizer with Non linear Spectral Subtraction. In Adaptive Gain Equalizer method (AGE), the input signal is divided into a number of subbands that are individually weighed in time domain, in accordance to the short time Signal-to-Noise Ratio (SNR) in each subband estimation at every time instant. Instead of focusing on suppression the noise on speech enhancement is focused. When analysis was done under various noise conditions for speech recognition, it was found that Adaptive Gain Equalizer method algorithm has an obvious failing point for a SNR of -5 dB, with inadequate levels of noise suppression for SNR less than this point. This work proposes the implementation of AGE when coupled with Non linear Spectral Subtraction (AGE-NSS) for robust speech recognition. The experimental result shows that out AGE-NSS performs the AGE when SNR drops below -5db level.
Abstract: A neuron can emit spikes in an irregular time basis and by averaging over a certain time window one would ignore a lot of information. It is known that in the context of fast information processing there is no sufficient time to sample an average firing rate of the spiking neurons. The present work shows that the spiking neurons are capable of computing the radial basis functions by storing the relevant information in the neurons' delays. One of the fundamental findings of the this research also is that when using overlapping receptive fields to encode the data patterns it increases the network-s clustering capacity. The clustering algorithm that is discussed here is interesting from computer science and neuroscience point of view as well as from a perspective.
Abstract: In this paper, we propose a robust scheme to work face alignment and recognition under various influences. For face representation, illumination influence and variable expressions are the important factors, especially the accuracy of facial localization and face recognition. In order to solve those of factors, we propose a robust approach to overcome these problems. This approach consists of two phases. One phase is preprocessed for face images by means of the proposed illumination normalization method. The location of facial features can fit more efficient and fast based on the proposed image blending. On the other hand, based on template matching, we further improve the active shape models (called as IASM) to locate the face shape more precise which can gain the recognized rate in the next phase. The other phase is to process feature extraction by using principal component analysis and face recognition by using support vector machine classifiers. The results show that this proposed method can obtain good facial localization and face recognition with varied illumination and local distortion.
Abstract: Character segmentation is an important preprocessing
step for text recognition. In degraded documents, existence of
touching characters decreases recognition rate drastically, for any
optical character recognition (OCR) system. In this paper we have
proposed a complete solution for segmenting touching characters in
all the three zones of printed Gurmukhi script. A study of touching
Gurmukhi characters is carried out and these characters have been
divided into various categories after a careful analysis. Structural
properties of the Gurmukhi characters are used for defining the
categories. New algorithms have been proposed to segment the
touching characters in middle zone, upper zone and lower zone.
These algorithms have shown a reasonable improvement in
segmenting the touching characters in degraded printed Gurmukhi
script. The algorithms proposed in this paper are applicable only to
machine printed text. We have also discussed a new and useful
technique to segment the horizontally overlapping lines.
Abstract: Nowadays, web-based technologies influence in
people-s daily life such as in education, business and others.
Therefore, many web developers are too eager to develop their web
applications with fully animation graphics and forgetting its
accessibility to its users. Their purpose is to make their web
applications look impressive. Thus, this paper would highlight on the
usability and accessibility of a voice recognition browser as a tool to
facilitate the visually impaired and blind learners in accessing virtual
learning environment. More specifically, the objectives of the study
are (i) to explore the challenges faced by the visually impaired
learners in accessing virtual learning environment (ii) to determine
the suitable guidelines for developing a voice recognition browser
that is accessible to the visually impaired. Furthermore, this study
was prepared based on an observation conducted with the Malaysian
visually impaired learners. Finally, the result of this study would
underline on the development of an accessible voice recognition
browser for the visually impaired.
Abstract: Dealing with hundreds of features in character
recognition systems is not unusual. This large number of features
leads to the increase of computational workload of recognition
process. There have been many methods which try to remove
unnecessary or redundant features and reduce feature dimensionality.
Besides because of the characteristics of Farsi scripts, it-s not
possible to apply other languages algorithms to Farsi directly. In this
paper some methods for feature subset selection using genetic
algorithms are applied on a Farsi optical character recognition (OCR)
system. Experimental results show that application of genetic
algorithms (GA) to feature subset selection in a Farsi OCR results in
lower computational complexity and enhanced recognition rate.
Abstract: A comparison between the performance of Latin and
Arabic handwritten digits recognition problems is presented. The
performance of ten different classifiers is tested on two similar
Arabic and Latin handwritten digits databases. The analysis shows
that Arabic handwritten digits recognition problem is easier than that
of Latin digits. This is because the interclass difference in case of
Latin digits is smaller than in Arabic digits and variances in writing
Latin digits are larger. Consequently, weaker yet fast classifiers are
expected to play more prominent role in Arabic handwritten digits
recognition.
Abstract: SoftBoost is a recently presented boosting algorithm,
which trades off the size of achieved classification margin and
generalization performance. This paper presents a performance
evaluation of SoftBoost algorithm on the generic object recognition
problem. An appearance-based generic object recognition
model is used. The evaluation experiments are performed using
a difficult object recognition benchmark. An assessment with respect
to different degrees of label noise as well as a comparison to
the well known AdaBoost algorithm is performed. The obtained
results reveal that SoftBoost is encouraged to be used in cases
when the training data is known to have a high degree of noise.
Otherwise, using Adaboost can achieve better performance.
Abstract: In this paper a new approach to face recognition is
presented that achieves double dimension reduction, making the
system computationally efficient with better recognition results and
out perform common DCT technique of face recognition. In pattern
recognition techniques, discriminative information of image
increases with increase in resolution to a certain extent, consequently
face recognition results change with change in face image resolution
and provide optimal results when arriving at a certain resolution
level. In the proposed model of face recognition, initially image
decimation algorithm is applied on face image for dimension
reduction to a certain resolution level which provides best
recognition results. Due to increased computational speed and feature
extraction potential of Discrete Cosine Transform (DCT), it is
applied on face image. A subset of coefficients of DCT from low to
mid frequencies that represent the face adequately and provides best
recognition results is retained. A tradeoff between decimation factor,
number of DCT coefficients retained and recognition rate with
minimum computation is obtained. Preprocessing of the image is
carried out to increase its robustness against variations in poses and
illumination level. This new model has been tested on different
databases which include ORL , Yale and EME color database.
Abstract: Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base-classifiers. Boosting algorithms are considered stronger than bagging on noisefree data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using a voting methodology of bagging and boosting ensembles with 10 subclassifiers in each one. We performed a comparison with simple bagging and boosting ensembles with 25 sub-classifiers, as well as other well known combining methods, on standard benchmark datasets and the proposed technique was the most accurate.
Abstract: In pattern recognition applications the low level
segmentation and the high level object recognition are generally
considered as two separate steps. The paper presents a method that
bridges the gap between the low and the high level object
recognition. It is based on a Bayesian network representation and
network propagation algorithm. At the low level it uses hierarchical
structure of quadratic spline wavelet image bases. The method is
demonstrated for a simple circuit diagram component identification
problem.
Abstract: This paper proposes view-point insensitive human
pose recognition system using neural network. Recognition system
consists of silhouette image capturing module, data driven database,
and neural network. The advantages of our system are first, it is
possible to capture multiple view-point silhouette images of 3D human
model automatically. This automatic capture module is helpful to
reduce time consuming task of database construction. Second, we
develop huge feature database to offer view-point insensitivity at pose
recognition. Third, we use neural network to recognize human pose
from multiple-view because every pose from each model have similar
feature patterns, even though each model has different appearance and
view-point. To construct database, we need to create 3D human model
using 3D manipulate tools. Contour shape is used to convert silhouette
image to feature vector of 12 degree. This extraction task is processed
semi-automatically, which benefits in that capturing images and
converting to silhouette images from the real capturing environment is
needless. We demonstrate the effectiveness of our approach with
experiments on virtual environment.
Abstract: Finger spelling is an art of communicating by signs
made with fingers, and has been introduced into sign language to serve
as a bridge between the sign language and the verbal language.
Previous approaches to finger spelling recognition are classified into
two categories: glove-based and vision-based approaches. The
glove-based approach is simpler and more accurate recognizing work
of hand posture than vision-based, yet the interfaces require the user to
wear a cumbersome and carry a load of cables that connected the
device to a computer. In contrast, the vision-based approaches provide
an attractive alternative to the cumbersome interface, and promise
more natural and unobtrusive human-computer interaction. The
vision-based approaches generally consist of two steps: hand
extraction and recognition, and two steps are processed independently.
This paper proposes real-time vision-based Korean finger spelling
recognition system by integrating hand extraction into recognition.
First, we tentatively detect a hand region using CAMShift algorithm.
Then fill factor and aspect ratio estimated by width and height
estimated by CAMShift are used to choose candidate from database,
which can reduce the number of matching in recognition step. To
recognize the finger spelling, we use DTW(dynamic time warping)
based on modified chain codes, to be robust to scale and orientation
variations. In this procedure, since accurate hand regions, without
holes and noises, should be extracted to improve the precision, we use
graph cuts algorithm that globally minimize the energy function
elegantly expressed by Markov random fields (MRFs). In the
experiments, the computational times are less than 130ms, and the
times are not related to the number of templates of finger spellings in
database, as candidate templates are selected in extraction step.
Abstract: In this paper, we propose an improved 3D star skeleton
technique, which is a suitable skeletonization for human posture representation
and reflects the 3D information of human posture.
Moreover, the proposed technique is simple and then can be performed
in real-time. The existing skeleton construction techniques, such as
distance transformation, Voronoi diagram, and thinning, focus on the
precision of skeleton information. Therefore, those techniques are not
applicable to real-time posture recognition since they are computationally
expensive and highly susceptible to noise of boundary. Although
a 2D star skeleton was proposed to complement these problems,
it also has some limitations to describe the 3D information of the
posture. To represent human posture effectively, the constructed skeleton
should consider the 3D information of posture. The proposed 3D
star skeleton contains 3D data of human, and focuses on human action
and posture recognition. Our 3D star skeleton uses the 8 projection
maps which have 2D silhouette information and depth data of human
surface. And the extremal points can be extracted as the features of 3D
star skeleton, without searching whole boundary of object. Therefore,
on execution time, our 3D star skeleton is faster than the “greedy" 3D
star skeleton using the whole boundary points on the surface. Moreover,
our method can offer more accurate skeleton of posture than the
existing star skeleton since the 3D data for the object is concerned.
Additionally, we make a codebook, a collection of representative 3D
star skeletons about 7 postures, to recognize what posture of constructed
skeleton is.
Abstract: In this article, we expose our research work in
Human-machine Interaction. The research consists in manipulating
the workspace by eyes. We present some of our results, in particular
the detection of eyes and the mouse actions recognition. Indeed, the
handicaped user becomes able to interact with the machine in a more
intuitive way in diverse applications and contexts. To test our
application we have chooses to work in real time on videos captured
by a camera placed in front of the user.
Abstract: In this work the opportunity of construction of the
qualifiers for face-recognition systems based on conjugation criteria
is investigated. The linkage between the bipartite conjugation, the
conjugation with a subspace and the conjugation with the null-space
is shown. The unified solving rule is investigated. It makes the
decision on the rating of face to a class considering the linkage
between conjugation values. The described recognition method can
be successfully applied to the distributed systems of video control
and video observation.