Abstract: In this paper, we propose a robust scheme for face alignment and recognition under varying conditions. Illumination and variable expressions are important factors in face representation, and they particularly affect the accuracy of facial localization and face recognition. The proposed approach addresses these factors in two phases. The first phase preprocesses face images with the proposed illumination normalization method; with the proposed image blending, facial features can then be located faster and more efficiently. In addition, we improve active shape models using template matching (the improved model is called IASM) to locate the face shape more precisely, which raises the recognition rate in the next phase. The second phase performs feature extraction using principal component analysis and face recognition using support vector machine classifiers. The results show that the proposed method achieves good facial localization and face recognition under varied illumination and local distortion.
Abstract: Character segmentation is an important preprocessing
step for text recognition. In degraded documents, the presence of
touching characters drastically decreases the recognition rate of any
optical character recognition (OCR) system. In this paper we
propose a complete solution for segmenting touching characters in
all three zones of printed Gurmukhi script. Touching Gurmukhi
characters are studied and, after careful analysis, divided into
categories defined by the structural properties of the Gurmukhi
characters. New algorithms are proposed to segment touching
characters in the middle, upper and lower zones. These algorithms
show a reasonable improvement in segmenting touching characters in
degraded printed Gurmukhi script. The algorithms proposed in this
paper are applicable only to machine-printed text. We also discuss a
new and useful technique for segmenting horizontally overlapping
lines.
Abstract: Nowadays, web-based technologies influence
people's daily lives in areas such as education and business.
Many web developers are eager to build web applications with
fully animated graphics so that they look impressive, and in doing
so forget about accessibility for their users. This paper therefore
highlights the usability and accessibility of a voice recognition
browser as a tool to help visually impaired and blind learners access
a virtual learning environment. More specifically, the objectives of
the study are (i) to explore the challenges faced by visually impaired
learners in accessing a virtual learning environment and (ii) to
determine suitable guidelines for developing a voice recognition
browser that is accessible to the visually impaired. The study is
based on an observation conducted with Malaysian visually impaired
learners. The results of this study inform the development of an
accessible voice recognition browser for the visually impaired.
Abstract: Dealing with hundreds of features in character
recognition systems is not unusual. Such a large number of features
increases the computational workload of the recognition
process. Many methods have been proposed to remove
unnecessary or redundant features and reduce feature dimensionality.
Moreover, because of the characteristics of Farsi script, it is not
possible to apply algorithms designed for other languages to Farsi
directly. In this paper, several methods for feature subset selection
using genetic algorithms (GA) are applied to a Farsi optical character
recognition (OCR) system. Experimental results show that applying GA
to feature subset selection in a Farsi OCR system results in
lower computational complexity and an enhanced recognition rate.
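The idea of GA-based feature subset selection can be sketched as follows. This is a minimal illustration, not the method evaluated in the abstract: the function names `ga_select` and `toy_fitness`, the population size, the mutation rate and the fitness function itself are all assumptions. In a real OCR system the fitness of a mask would be the recognition rate of the classifier trained on the selected features, possibly penalized by the number of features.

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Return (best_mask, best_score); a mask is a list of 0/1 flags."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        def pick():
            # Binary tournament selection between two random individuals.
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation toggles a feature on or off.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)

def toy_fitness(mask):
    """Hypothetical fitness: features 0-3 are useful, each feature has a cost."""
    return sum(mask[:4]) - 0.1 * sum(mask)

mask, score = ga_select(10, toy_fitness)
print(mask, score)
```

The per-feature cost term in the toy fitness stands in for the computational workload, so that smaller subsets are preferred when they score equally well.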
Abstract: A comparison between the performance of Latin and
Arabic handwritten digit recognition is presented. The
performance of ten different classifiers is tested on two similar
Arabic and Latin handwritten digit databases. The analysis shows
that the Arabic handwritten digit recognition problem is easier than
that of Latin digits. This is because the interclass difference for
Latin digits is smaller than for Arabic digits, and the variance in
writing Latin digits is larger. Consequently, weaker yet faster
classifiers are expected to play a more prominent role in Arabic
handwritten digit recognition.
Abstract: SoftBoost is a recently presented boosting algorithm
that trades off the size of the achieved classification margin against
generalization performance. This paper presents a performance
evaluation of the SoftBoost algorithm on the generic object recognition
problem. An appearance-based generic object recognition
model is used. The evaluation experiments are performed on
a difficult object recognition benchmark. SoftBoost is assessed under
different degrees of label noise and compared with
the well-known AdaBoost algorithm. The results indicate that
SoftBoost is preferable when the training data is known to have a
high degree of noise. Otherwise, AdaBoost can achieve better
performance.
Abstract: In this paper a new approach to face recognition is
presented that achieves double dimension reduction, making the
system computationally efficient, improving recognition results, and
outperforming the common DCT technique of face recognition. In
pattern recognition, the discriminative information in an image
increases with resolution up to a certain extent; consequently,
face recognition results change with face image resolution
and are optimal at a certain resolution level. In the proposed model,
an image decimation algorithm is first applied to the face image to
reduce its dimension to the resolution level that provides the best
recognition results. The Discrete Cosine Transform (DCT) is then
applied to the face image because of its computational speed and
feature extraction potential. A subset of DCT coefficients from low to
mid frequencies that represents the face adequately and provides the
best recognition results is retained. A tradeoff is obtained between
the decimation factor, the number of DCT coefficients retained and the
recognition rate with minimum computation. The image is preprocessed
to increase robustness against variations in pose and
illumination level. The new model has been tested on several
databases, including the ORL, Yale and EME color databases.
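The two reduction steps, decimation followed by retaining low-to-mid frequency DCT coefficients, can be sketched as below. This is a simplified illustration under stated assumptions: the helper names `decimate`, `dct2` and `low_freq_features` are hypothetical, decimation is plain subsampling without an anti-aliasing filter, the image is assumed square, and the retained subset is taken as the triangle of coefficients with u + v < k, which may differ from the selection used in the paper.

```python
import math

def decimate(img, factor):
    """Keep every `factor`-th pixel in both directions (plain subsampling)."""
    return [row[::factor] for row in img[::factor]]

def dct2(img):
    """Orthonormal 2D DCT-II of a square image given as a list of rows."""
    n = len(img)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # Precompute the cosine basis: cos_table[u][x] = cos((2x + 1) u pi / 2n).
    cos_table = [[math.cos((2 * x + 1) * u * math.pi / (2 * n))
                  for x in range(n)] for u in range(n)]
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(img[x][y] * cos_table[u][x] * cos_table[v][y]
                    for x in range(n) for y in range(n))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def low_freq_features(coeffs, k):
    """Return the coefficients with u + v < k as a flat feature vector."""
    n = len(coeffs)
    return [coeffs[u][v] for u in range(n) for v in range(n) if u + v < k]

# Usage on a hypothetical 8x8 image: decimate by 2, transform, keep 6 values.
image = [[float((x + y) % 5) for y in range(8)] for x in range(8)]
features = low_freq_features(dct2(decimate(image, 2)), 3)
print(len(features))
```

Both the decimation factor and `k` shrink the feature vector, which is the tradeoff against recognition rate that the abstract refers to.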
Abstract: Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base classifiers. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble that combines, by voting, a bagging ensemble and a boosting ensemble with 10 sub-classifiers each. We compared it with simple bagging and boosting ensembles with 25 sub-classifiers, as well as with other well-known combining methods, on standard benchmark datasets, and the proposed technique was the most accurate.
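One way to combine a bagging ensemble and a boosting ensemble by voting is sketched below. The abstract does not specify the voting rule, so this sketch assumes that each ensemble contributes the fraction of its sub-classifiers voting for each class and that the two fractions are averaged; the function names `vote_fractions` and `combine` are hypothetical, and the sub-classifier predictions are taken as given.

```python
from collections import Counter

def vote_fractions(predictions):
    """Map each class label to the fraction of sub-classifiers that chose it."""
    counts = Counter(predictions)
    total = len(predictions)
    return {label: c / total for label, c in counts.items()}

def combine(bagging_preds, boosting_preds):
    """Average the two ensembles' vote fractions and return the top class."""
    bag = vote_fractions(bagging_preds)
    boost = vote_fractions(boosting_preds)
    labels = sorted(set(bag) | set(boost))
    support = {l: (bag.get(l, 0.0) + boost.get(l, 0.0)) / 2 for l in labels}
    # `labels` is sorted, so ties go to the smallest label.
    return max(labels, key=lambda l: support[l])

# Ten sub-classifiers per ensemble, predicting the class of one sample.
bagging = ["a"] * 7 + ["b"] * 3
boosting = ["a"] * 4 + ["b"] * 6
print(combine(bagging, boosting))
```

Averaging fractions gives the two ensembles equal weight regardless of how many sub-classifiers each contains.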
Abstract: In pattern recognition applications, low-level
segmentation and high-level object recognition are generally
considered two separate steps. This paper presents a method that
bridges the gap between low-level and high-level object
recognition. It is based on a Bayesian network representation and
a network propagation algorithm. At the low level it uses a
hierarchical structure of quadratic spline wavelet image bases. The
method is demonstrated on a simple circuit diagram component
identification problem.
Abstract: This paper proposes a viewpoint-insensitive human
pose recognition system using a neural network. The recognition system
consists of a silhouette image capturing module, a data-driven
database, and a neural network. The system has three advantages.
First, it can capture multiple-viewpoint silhouette images of a 3D
human model automatically, which reduces the time-consuming task of
database construction. Second, we build a large feature database to
provide viewpoint insensitivity in pose recognition. Third, we use a
neural network to recognize human pose from multiple views, because a
given pose has similar feature patterns across models, even though
each model has a different appearance and viewpoint. To construct the
database, we create 3D human models using 3D manipulation tools. The
contour shape is used to convert each silhouette image to a feature
vector of degree 12. This extraction is performed semi-automatically,
with the benefit that capturing images in a real environment and
converting them to silhouette images is unnecessary. We demonstrate
the effectiveness of our approach with experiments in a virtual
environment.
Abstract: Finger spelling is the art of communicating by signs
made with the fingers, and has been introduced into sign language to
serve as a bridge between sign language and verbal language.
Previous approaches to finger spelling recognition fall into
two categories: glove-based and vision-based. The glove-based
approach is simpler and recognizes hand posture more accurately
than the vision-based approach, yet it requires the user to
wear a cumbersome device and carry a load of cables that connect the
device to a computer. In contrast, vision-based approaches provide
an attractive alternative to this cumbersome interface and promise
more natural and unobtrusive human-computer interaction.
Vision-based approaches generally consist of two steps, hand
extraction and recognition, which are processed independently.
This paper proposes a real-time vision-based Korean finger spelling
recognition system that integrates hand extraction into recognition.
First, we tentatively detect a hand region using the CAMShift
algorithm. Then the fill factor and aspect ratio, computed from the
width and height estimated by CAMShift, are used to choose candidates
from the database, which reduces the number of matching operations in
the recognition step. To recognize the finger spelling, we use dynamic
time warping (DTW) based on modified chain codes, which makes the
method robust to scale and orientation variations. Because accurate
hand regions, without holes and noise, must be extracted to improve
precision, we use a graph cuts algorithm that globally minimizes the
energy function expressed by Markov random fields (MRFs). In the
experiments, the computation time is less than 130 ms and is
independent of the number of finger spelling templates in the
database, because candidate templates are selected in the extraction
step.
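The matching step can be illustrated with a minimal DTW sketch over chain codes. The abstract does not describe its modified chain codes, so this sketch assumes plain 8-direction Freeman codes (0 to 7) with a cyclic local cost; the names `chain_cost`, `dtw` and `best_template` and the example templates are hypothetical. DTW absorbs differences in contour length, while orientation normalization is not covered here.

```python
def chain_cost(a, b):
    """Cyclic distance between two 8-direction chain codes (0..7)."""
    d = abs(a - b) % 8
    return min(d, 8 - d)

def dtw(seq_a, seq_b, cost=chain_cost):
    """Classic O(len_a * len_b) dynamic time warping distance."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def best_template(query, templates):
    """Return the name of the candidate with the smallest DTW distance."""
    return min(templates, key=lambda name: dtw(query, templates[name]))

# Hypothetical candidate templates already narrowed down by the extraction step.
candidates = {"square": [0, 0, 2, 2, 4, 4, 6, 6], "line": [0, 0, 0, 0]}
print(best_template([0, 2, 4, 6], candidates))
```

Because only the preselected candidates are compared, the cost of this step depends on the number of candidates rather than on the size of the whole database.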
Abstract: In this paper, we propose an improved 3D star skeleton
technique, a skeletonization suited to human posture representation
that reflects the 3D information of human posture.
Moreover, the proposed technique is simple and can therefore be
performed in real time. Existing skeleton construction techniques,
such as distance transformation, Voronoi diagrams, and thinning, focus
on the precision of skeleton information. Those techniques are
therefore not applicable to real-time posture recognition, since they
are computationally expensive and highly susceptible to boundary
noise. Although a 2D star skeleton was proposed to address these
problems, it is limited in describing the 3D information of the
posture. To represent human posture effectively, the constructed
skeleton should take the 3D information of the posture into account.
The proposed 3D star skeleton contains 3D data of the human body and
targets human action and posture recognition. Our 3D star skeleton
uses 8 projection maps that hold 2D silhouette information and depth
data of the human surface. The extremal points can be extracted as the
features of the 3D star skeleton without searching the whole boundary
of the object. As a result, our 3D star skeleton is faster in
execution time than the "greedy" 3D star skeleton, which uses all
boundary points on the surface. Moreover, our method yields a more
accurate skeleton of the posture than the existing star skeleton
because it uses the 3D data of the object. Additionally, we build a
codebook, a collection of representative 3D star skeletons for 7
postures, to recognize which posture a constructed skeleton
represents.
Abstract: In this article, we present our research work in
human-machine interaction. The research consists of manipulating
the workspace with the eyes. We present some of our results, in
particular the detection of the eyes and the recognition of mouse
actions. With this approach, a handicapped user is able to interact
with the machine in a more intuitive way in diverse applications and
contexts. To test our application, we chose to work in real time on
videos captured by a camera placed in front of the user.
Abstract: This work investigates the possibility of constructing
classifiers for face recognition systems based on conjugation
criteria. The linkage between bipartite conjugation, conjugation with
a subspace and conjugation with the null space is shown. A unified
decision rule is investigated that assigns a face to a class by
considering the linkage between conjugation values. The described
recognition method can be successfully applied to distributed systems
for video control and video observation.
Abstract: Automatic Vehicle Identification (AVI) has many
applications in traffic systems (highway electronic toll collection, red
light violation enforcement, border and customs checkpoints, etc.).
License plate recognition is an effective form of AVI. In
this study, a smart and simple algorithm is presented for a vehicle
license plate recognition system. The proposed algorithm consists of
three major parts: extraction of the plate region, segmentation of
characters and recognition of plate characters. For extracting the
plate region, edge detection and smearing algorithms are
used. In the segmentation part, smearing algorithms, filtering and some
morphological algorithms are used. Finally, statistics-based
template matching is used for recognition of plate characters. The
performance of the proposed algorithm has been tested on real
images. Based on the experimental results, the
algorithm shows superior performance in car license plate
recognition.
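The abstract does not define its smearing step. One common reading is run-length smoothing, where short gaps between foreground pixels are filled so that nearby characters merge into a single plate-like block. The sketch below assumes that reading and a binary image stored as lists of 0/1 values; the names `smear_row` and `smear` and the gap thresholds are hypothetical. It applies a horizontal pass and then a vertical pass, which is a simplification of the classic variant that combines two independent passes.

```python
def smear_row(row, max_gap):
    """Fill runs of 0s no longer than max_gap that lie between two 1s."""
    out = list(row)
    last_one = None
    for i, v in enumerate(row):
        if v == 1:
            if last_one is not None and 0 < i - last_one - 1 <= max_gap:
                for k in range(last_one + 1, i):
                    out[k] = 1
            last_one = i
    return out

def smear(img, max_gap_h, max_gap_v):
    """Horizontal smearing of every row, then vertical smearing of every column."""
    rows = [smear_row(r, max_gap_h) for r in img]
    cols = [smear_row(list(c), max_gap_v) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# Two character-like strokes two pixels apart merge; the distant pixel does not.
print(smear_row([1, 0, 0, 1, 0, 0, 0, 1], 2))
```

After smearing, connected regions whose size and aspect ratio resemble a plate can be kept as candidates for the later segmentation step.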
Abstract: Human-computer interaction has progressed
considerably from the traditional modes of interaction. Vision-based
interfaces are a revolutionary technology, allowing interaction
through human actions and gestures. Researchers have developed
numerous accurate techniques; however, with few exceptions,
these techniques are not evaluated using standard HCI techniques. In
this paper we present a comprehensive framework to address this
issue. Our evaluation of a computer vision application shows that, in
addition to accuracy, it is vital to address human factors.
Abstract: Optical Character Recognition (OCR) is a long-standing problem of great interest in the pattern recognition field. In this paper we introduce a powerful approach to recognizing Persian text. We use morphological operators, especially the Hit/Miss operator, to describe each sub-word, and we classify the generated descriptions with a template matching approach. We used just one font in two different sizes to verify our approach. We achieved a very good recognition rate, up to 99.9%.
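The Hit/Miss operator marks the pixels where a foreground pattern and a background pattern both fit. A minimal sketch on a binary image is shown below; the function name `hit_or_miss`, the offset-list representation of the structuring elements and the corner-detecting example are assumptions for illustration, not the structuring elements used in the paper.

```python
def hit_or_miss(img, hit, miss):
    """Binary hit-or-miss transform.

    `hit` and `miss` are lists of (dy, dx) offsets. A pixel is marked when
    every hit offset lands on a 1 and every miss offset lands on a 0.
    Pixels outside the image are treated as 0.
    """
    h, w = len(img), len(img[0])

    def at(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else 0

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if (all(at(y + dy, x + dx) == 1 for dy, dx in hit)
                    and all(at(y + dy, x + dx) == 0 for dy, dx in miss)):
                out[y][x] = 1
    return out

# Hypothetical pattern: upper-left corner of a stroke.
corner_hit = [(0, 0), (0, 1), (1, 0)]
corner_miss = [(-1, 0), (0, -1)]
glyph = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
print(hit_or_miss(glyph, corner_hit, corner_miss))
```

The positions returned for a set of such patterns could serve as the description of a sub-word that is then compared against stored templates.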
Abstract: This paper presents an approach to biometric
authentication through keystroke dynamics, which aims to
identify a person from their habitual typing rhythm on a conventional
keyboard. Seven experiments were carried out, varying the number of
prototypes, the threshold, the features and the choice of timings in
the feature vector. The results show that keystroke dynamics is simple
and efficient for personal authentication; the best results were
obtained using 90% of the features, with a false rejection rate (FRR)
of 4.44% and a false acceptance rate (FAR) of 0%.
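The two error rates depend on the decision threshold, which can be illustrated with a short sketch. It assumes that each authentication attempt yields a distance between the typed sample and the user's prototype and that an attempt is accepted when the distance does not exceed the threshold; the function name `frr_far` and the example scores are hypothetical and are not data from the paper.

```python
def frr_far(genuine_scores, impostor_scores, threshold):
    """Return (FRR, FAR) in percent for a distance-based accept rule."""
    # A genuine attempt is falsely rejected when its distance is too large.
    false_rejects = sum(1 for s in genuine_scores if s > threshold)
    # An impostor attempt is falsely accepted when its distance is small enough.
    false_accepts = sum(1 for s in impostor_scores if s <= threshold)
    frr = 100.0 * false_rejects / len(genuine_scores)
    far = 100.0 * false_accepts / len(impostor_scores)
    return frr, far

# Made-up distances for illustration only.
genuine = [0.1, 0.2, 0.3, 0.9]
impostor = [0.8, 1.0, 1.2, 1.5]
print(frr_far(genuine, impostor, threshold=0.5))
```

Raising the threshold lowers FRR and raises FAR, so a reported pair of rates always corresponds to one particular threshold choice.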
Abstract: Robust face recognition under various illumination
environments is very difficult, yet it is needed for
successful commercialization. In this paper, we propose an improved
illumination normalization method for face recognition. The
illumination normalization algorithm based on anisotropic smoothing is
well known to be effective among illumination normalization methods,
but it degrades the intensity contrast of the original image and
produces less sharp edges. The proposed method improves the previous
anisotropic smoothing-based illumination normalization method so
that it increases the intensity contrast and enhances the edges while
diminishing the effect of illumination variations. As a result of
these improvements, face images preprocessed by the proposed
illumination normalization method have more distinctive
feature vectors (Gabor feature vectors) for face recognition. The
effectiveness of the proposed illumination normalization method is
verified through face recognition experiments based on Gabor feature
vector similarity.
Abstract: The γ-turns play important roles in protein folding and
molecular recognition. The prediction and analysis of γ-turn types are
important both for protein structure prediction and for a better
understanding of the characteristics of different γ-turn types. This
study proposes a physicochemical property-based decision tree (PPDT)
method to interpretably predict γ-turn types. In addition to the good
prediction performance of PPDT, three simple and human-interpretable
IF-THEN rules are extracted from the decision tree
constructed by PPDT. The identified informative physicochemical
properties and concise rules provide a simple way for discriminating
and understanding γ-turn types.