Abstract: Character segmentation is an important preprocessing step for text recognition. In degraded documents, existence of touching characters decreases recognition rate drastically, for any optical character recognition (OCR) system. In this paper a study of touching Gurmukhi characters is carried out and these characters have been divided into various categories after a careful analysis.Structural properties of the Gurmukhi characters are used for defining the categories. New algorithms have been proposed to segment the touching characters in middle zone. These algorithms have shown a reasonable improvement in segmenting the touching characters in degraded Gurmukhi script. The algorithms proposed in this paper are applicable only to machine printed text.
Abstract: Automatic currency note recognition invariably
depends on the currency note characteristics of a particular country
and the extraction of features directly affects the recognition ability.
Sri Lanka has not been involved in any kind of research or
implementation of this kind. The proposed system “SLCRec" comes
up with a solution focusing on minimizing false rejection of notes.
Sri Lankan currency notes undergo severe changes in image quality
in usage. Hence a special linear transformation function is adapted to
wipe out noise patterns from backgrounds without affecting the
notes- characteristic images and re-appear images of interest. The
transformation maps the original gray scale range into a smaller
range of 0 to 125. Applying Edge detection after the transformation
provided better robustness for noise and fair representation of edges
for new and old damaged notes. A three layer back propagation
neural network is presented with the number of edges detected in row
order of the notes and classification is accepted in four classes of
interest which are 100, 500, 1000 and 2000 rupee notes. The
experiments showed good classification results and proved that the
proposed methodology has the capability of separating classes
properly in varying image conditions.
Abstract: This paper proposed high level feature for online Lao handwritten recognition. This feature must be high level enough so that the feature is not change when characters are written by different persons at different speed and different proportion (shorter or longer stroke, head, tail, loop, curve). In this high level feature, a character is divided in to sequence of curve segments where a segment start where curve reverse rotation (counter clockwise and clockwise). In each segment, following features are gathered cumulative change in direction of curve (- for clockwise), cumulative curve length, cumulative length of left to right, right to left, top to bottom and bottom to top ( cumulative change in X and Y axis of segment). This feature is simple yet robust for high accuracy recognition. The feature can be gather from parsing the original time sampling sequence X, Y point of the pen location without re-sampling. We also experiment on other segmentation point such as the maximum curvature point which was widely used by other researcher. Experiments results show that the recognition rates are at 94.62% in comparing to using maximum curvature point 75.07%. This is due to a lot of variations of turning points in handwritten.
Abstract: Fuzzy C-means Clustering algorithm (FCM) is a
method that is frequently used in pattern recognition. It has the
advantage of giving good modeling results in many cases, although,
it is not capable of specifying the number of clusters by itself. In
FCM algorithm most researchers fix weighting exponent (m) to a
conventional value of 2 which might not be the appropriate for all
applications. Consequently, the main objective of this paper is to use
the subtractive clustering algorithm to provide the optimal number of
clusters needed by FCM algorithm by optimizing the parameters of
the subtractive clustering algorithm by an iterative search approach
and then to find an optimal weighting exponent (m) for the FCM
algorithm. In order to get an optimal number of clusters, the iterative
search approach is used to find the optimal single-output Sugenotype
Fuzzy Inference System (FIS) model by optimizing the
parameters of the subtractive clustering algorithm that give minimum
least square error between the actual data and the Sugeno fuzzy
model. Once the number of clusters is optimized, then two
approaches are proposed to optimize the weighting exponent (m) in
the FCM algorithm, namely, the iterative search approach and the
genetic algorithms. The above mentioned approach is tested on the
generated data from the original function and optimal fuzzy models
are obtained with minimum error between the real data and the
obtained fuzzy models.
Abstract: This paper reports a new pattern recognition approach for face recognition. The biological model of light receptors - cones and rods in human eyes and the way they are associated with pattern vision in human vision forms the basis of this approach. The functional model is simulated using CWD and WPD. The paper also discusses the experiments performed for face recognition using the features extracted from images in the AT & T face database. Artificial Neural Network and k- Nearest Neighbour classifier algorithms are employed for the recognition purpose. A feature vector is formed for each of the face images in the database and recognition accuracies are computed and compared using the classifiers. Simulation results show that the proposed method outperforms traditional way of feature extraction methods prevailing for pattern recognition in terms of recognition accuracy for face images with pose and illumination variations.
Abstract: Linearization of graph embedding has been emerged
as an effective dimensionality reduction technique in pattern
recognition. However, it may not be optimal for nonlinearly
distributed real world data, such as face, due to its linear nature. So, a
kernelization of graph embedding is proposed as a dimensionality
reduction technique in face recognition. In order to further boost the
recognition capability of the proposed technique, the Fisher-s
criterion is opted in the objective function for better data
discrimination. The proposed technique is able to characterize the
underlying intra-class structure as well as the inter-class separability.
Experimental results on FRGC database validate the effectiveness of
the proposed technique as a feature descriptor.
Abstract: We report in this paper the procedure of a system of
automatic speech recognition based on techniques of the dynamic
programming. The technique of temporal retiming is a technique
used to synchronize between two forms to compare. We will see how
this technique is adapted to the field of the automatic speech
recognition. We will expose, in a first place, the theory of the
function of retiming which is used to compare and to adjust an
unknown form with a whole of forms of reference constituting the
vocabulary of the application. Then we will give, in the second place,
the various algorithms necessary to their implementation on machine.
The algorithms which we will present were tested on part of the
corpus of words in Arab language Arabdic-10 [4] and gave whole
satisfaction. These algorithms are effective insofar as we apply them
to the small ones or average vocabularies.
Abstract: Face recognition in the infrared spectrum has attracted a lot of interest in recent years. Many of the techniques used in infrared are based on their visible counterpart, especially linear techniques like PCA and LDA. In this work, we introduce a probabilistic Bayesian framework for face recognition in the infrared spectrum. In the infrared spectrum, variations can occur between face images of the same individual due to pose, metabolic, time changes, etc. Bayesian approaches permit to reduce intrapersonal variation, thus making them very interesting for infrared face recognition. This framework is compared with classical linear techniques. Non linear techniques we developed recently for infrared face recognition are also presented and compared to the Bayesian face recognition framework. A new approach for infrared face extraction based on SVM is introduced. Experimental results show that the Bayesian technique is promising and lead to interesting results in the infrared spectrum when a sufficient number of face images is used in an intrapersonal learning process.
Abstract: In the present study, a support vector machine (SVM) learning approach to character recognition is proposed. Simple
feature detectors, similar to those found in the human visual system, were used in the SVM classifier. Alphabetic characters were rotated
to 8 different angles and using the proposed cognitive model, all characters were recognized with 100% accuracy and specificity.
These same results were found in psychiatric studies of human character recognition.
Abstract: Recognition of characters greatly depends upon the features used. Several features of the handwritten Arabic characters are selected and discussed. An off-line recognition system based on the selected features was built. The system was trained and tested with realistic samples of handwritten Arabic characters. Evaluation of the importance and accuracy of the selected features is made. The recognition based on the selected features give average accuracies of 88% and 70% for the numbers and letters, respectively. Further improvements are achieved by using feature weights based on insights gained from the accuracies of individual features.
Abstract: We have applied new accelerated algorithm for linear
discriminate analysis (LDA) in face recognition with support vector
machine. The new algorithm has the advantage of optimal selection
of the step size. The gradient descent method and new algorithm has
been implemented in software and evaluated on the Yale face
database B. The eigenfaces of these approaches have been used to
training a KNN. Recognition rate with new algorithm is compared
with gradient.
Abstract: An approach is offered for more precise definition of base lines- borders in handwritten cursive text and general problems of handwritten text segmentation have also been analyzed. An offered method tries to solve problems arose in handwritten recognition with specific slant or in other words, where the letters of the words are not on the same vertical line. As an informative features, some recognition systems use ascending and descending parts of the letters, found after the word-s baseline detection. In such recognition systems, problems in baseline detection, impacts the quality of the recognition and decreases the rate of the recognition. Despite other methods, here borders are found by small pieces containing segmentation elements and defined as a set of linear functions. In this method, separate borders for top and bottom border lines are found. At the end of the paper, as a result, azerbaijani cursive handwritten texts written in Latin alphabet by different authors has been analyzed.
Abstract: The intention of this lessons is to assess the probability
of optical coherence tomography (OCT) for biometric recognition.
The OCT is the foundation on an optical signal acquisition and
processing method and has the micrometer-resolution. In this study,
we used the porcine skin for verifying the abovementioned means. The
porcine tissue was sound acknowledged for structural and
immunohistochemical similarity with human skin, so it could be
suitable for pre-clinical trial as investigational specimen. For this
reason, it was tattooed by the tattoo machine with the tattoo-pigment.
We detected the pattern of the tattooed skin by the OCT according to
needle speed. The result was consistent with the histology images.
This result showed that the OCT was effective to examine the tattooed
skin section noninvasively. It might be available to identify
morphological changes inside the skin.
Abstract: Discrimination between different classes of environmental
sounds is the goal of our work. The use of a sound recognition
system can offer concrete potentialities for surveillance and
security applications. The first paper contribution to this research
field is represented by a thorough investigation of the applicability
of state-of-the-art audio features in the domain of environmental
sound recognition. Additionally, a set of novel features obtained by
combining the basic parameters is introduced. The quality of the
features investigated is evaluated by a HMM-based classifier to which
a great interest was done. In fact, we propose to use a Multi-Style
training system based on HMMs: one recognizer is trained on a
database including different levels of background noises and is used
as a universal recognizer for every environment. In order to enhance
the system robustness by reducing the environmental variability, we
explore different adaptation algorithms including Maximum Likelihood
Linear Regression (MLLR), Maximum A Posteriori (MAP)
and the MAP/MLLR algorithm that combines MAP and MLLR.
Experimental evaluation shows that a rather good recognition rate
can be reached, even under important noise degradation conditions
when the system is fed by the convenient set of features.
Abstract: In illumination variant face recognition, existing
methods extracting face albedo as light normalized image may lead to
loss of extensive facial details, with light template discarded. To
improve that, a novel approach for realistic facial texture
reconstruction by combining original image and albedo image is
proposed. First, light subspaces of different identities are established
from the given reference face images; then by projecting the original
and albedo image into each light subspace respectively, texture
reference images with corresponding lighting are reconstructed and
two texture subspaces are formed. According to the projections in
texture subspaces, facial texture with normal light can be synthesized.
Due to the combination of original image, facial details can be
preserved with face albedo. In addition, image partition is applied to
improve the synthesization performance. Experiments on Yale B and
CMUPIE databases demonstrate that this algorithm outperforms the
others both in image representation and in face recognition.
Abstract: The human friendly interaction is the key function of a human-centered system. Over the years, it has received much attention to develop the convenient interaction through intention recognition. Intention recognition processes multimodal inputs including speech, face images, and body gestures. In this paper, we suggest a novel approach of intention recognition using a graph representation called Intention Graph. A concept of valid intention is proposed, as a target of intention recognition. Our approach has two phases: goal recognition phase and intention recognition phase. In the goal recognition phase, we generate an action graph based on the observed actions, and then the candidate goals and their plans are recognized. In the intention recognition phase, the intention is recognized with relevant goals and user profile. We show that the algorithm has polynomial time complexity. The intention graph is applied to a simple briefcase domain to test our model.
Abstract: The use of human hand as a natural interface for humancomputer interaction (HCI) serves as the motivation for research in hand gesture recognition. Vision-based hand gesture recognition involves visual analysis of hand shape, position and/or movement. In this paper, we use the concept of object-based video abstraction for segmenting the frames into video object planes (VOPs), as used in MPEG-4, with each VOP corresponding to one semantically meaningful hand position. Next, the key VOPs are selected on the basis of the amount of change in hand shape – for a given key frame in the sequence the next key frame is the one in which the hand changes its shape significantly. Thus, an entire video clip is transformed into a small number of representative frames that are sufficient to represent a gesture sequence. Subsequently, we model a particular gesture as a sequence of key frames each bearing information about its duration. These constitute a finite state machine. For recognition, the states of the incoming gesture sequence are matched with the states of all different FSMs contained in the database of gesture vocabulary. The core idea of our proposed representation is that redundant frames of the gesture video sequence bear only the temporal information of a gesture and hence discarded for computational efficiency. Experimental results obtained demonstrate the effectiveness of our proposed scheme for key frame extraction, subsequent gesture summarization and finally gesture recognition.
Abstract: Random and natural textures classification is still
one of the biggest challenges in the field of image processing and
pattern recognition. In this paper, texture feature extraction using
Slant Hadamard Transform was studied and compared to other
signal processing-based texture classification schemes. A
parametric SHT was also introduced and employed for natural
textures feature extraction. We showed that a subtly modified
parametric SHT can outperform ordinary Walsh-Hadamard
transform and discrete cosine transform. Experiments were carried
out on a subset of Vistex random natural texture images using a
kNN classifier.
Abstract: Matching algorithms have significant importance in
speaker recognition. Feature vectors of the unknown utterance are
compared to feature vectors of the modeled speakers as a last step in
speaker recognition. A similarity score is found for every model in
the speaker database. Depending on the type of speaker recognition,
these scores are used to determine the author of unknown speech
samples. For speaker verification, similarity score is tested against a
predefined threshold and either acceptance or rejection result is
obtained. In the case of speaker identification, the result depends on
whether the identification is open set or closed set. In closed set
identification, the model that yields the best similarity score is
accepted. In open set identification, the best score is tested against a
threshold, so there is one more possible output satisfying the
condition that the speaker is not one of the registered speakers in
existing database. This paper focuses on closed set speaker
identification using a modified version of a well known matching
algorithm. The results of new matching algorithm indicated better
performance on YOHO international speaker recognition database.
Abstract: Gabor-based face representation has achieved enormous success in face recognition. This paper addresses a novel algorithm for face recognition using neural networks trained by Gabor features. The system is commenced on convolving a face image with a series of Gabor filter coefficients at different scales and orientations. Two novel contributions of this paper are: scaling of rms contrast and introduction of fuzzily skewed filter. The neural network employed for face recognition is based on the multilayer perceptron (MLP) architecture with backpropagation algorithm and incorporates the convolution filter response of Gabor jet. The effectiveness of the algorithm has been justified over a face database with images captured at different illumination conditions.