Abstract: Advances in Artificial Intelligence have led to the
development of various “smart" devices. A character recognition
device is one such smart device: it acquires partial human
intelligence, with the ability to capture and recognize
characters in different languages. First, multiscale neural
training with modified input training vectors is adopted in this
paper to exploit its advantage in training on higher-resolution
character images. Second, selective thresholding using a
minimum-distance technique is proposed to increase the accuracy of
character recognition. A simulator program (a GUI) is designed in
such a way that the characters can be located at any spot on the
blank paper on which they are written. The results show that
these methods, with a moderate number of training epochs, can
produce accuracies of at least 85% for handwritten upper-case
English characters and numerals.
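The minimum-distance idea with a rejection threshold can be sketched as follows; the prototype vectors, labels, and threshold value are illustrative assumptions, not taken from the paper.

```python
# Sketch of selective thresholding with a minimum-distance classifier.
def min_distance_classify(sample, prototypes, threshold):
    """Return the label of the nearest prototype, or None (reject)
    when even the best match is farther than the threshold."""
    best_label, best_dist = None, float("inf")
    for label, proto in prototypes.items():
        dist = sum((s - p) ** 2 for s, p in zip(sample, proto)) ** 0.5
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None

# Toy two-class example with 3-dimensional feature vectors.
prototypes = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0]}
print(min_distance_classify([0.9, 0.1, 0.0], prototypes, threshold=0.5))  # -> A
print(min_distance_classify([0.5, 0.5, 5.0], prototypes, threshold=0.5))  # -> None
```

Rejecting samples whose best distance exceeds the threshold is what lets a selective scheme trade coverage for accuracy.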
Abstract: In digital signal processing it is important to
approximate multi-dimensional data by a method called rank
reduction, in which the rank of multi-dimensional data is reduced
from higher to lower. For 2-dimensional data, singular value
decomposition (SVD) is one of the best-known rank reduction
techniques. In addition, an outer product expansion extended from
SVD was proposed and implemented for multi-dimensional data, and
has been widely applied to image processing and pattern recognition.
However, the multi-dimensional outer product expansion exhibits
great computational complexity and lacks orthogonality between the
expansion terms. We have therefore proposed an alternative method,
the Third-order Orthogonal Tensor Product Expansion (3-OTPE).
3-OTPE uses the power method instead of a nonlinear optimization
method to decrease computing time. Around the same time, the group
of De Lathauwer proposed the Higher-Order SVD (HOSVD), which is
also developed as an SVD extension for multi-dimensional data.
3-OTPE and HOSVD are similar in their rank reduction of
multi-dimensional data. Using these two methods we can obtain
computation results respectively; some are the same while others
differ slightly. In this paper, we compare 3-OTPE to HOSVD in
accuracy of calculation and computing time,
and clarify the difference between these two methods.
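For the 2-dimensional case mentioned above, rank reduction via truncated SVD can be sketched as follows; the data matrix and target rank are arbitrary placeholders.

```python
# Minimal sketch of rank reduction of a 2-D array via truncated SVD.
import numpy as np

def svd_rank_reduce(matrix, rank):
    """Best rank-`rank` approximation in the least-squares sense."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep only the `rank` largest singular triplets.
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank, :]

rng = np.random.default_rng(0)
data = rng.standard_normal((6, 4))
approx = svd_rank_reduce(data, rank=2)
print(np.linalg.matrix_rank(approx))  # -> 2
```

HOSVD and 3-OTPE generalize this truncation idea to tensors of order three and higher, where the notion of a single best low-rank approximation no longer holds exactly.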
Abstract: One major source of performance decline in speaker
recognition systems is channel mismatch between training and testing.
This paper focuses on improving the channel robustness of a speaker
recognition system in two respects: channel compensation techniques
and channel-robust features. The system is a text-independent speaker
identification system based on two-stage recognition. Regarding
channel compensation, this paper applies the MAP (Maximum
A Posteriori) channel compensation technique, previously
used in speech recognition, to speaker recognition. Regarding
channel-robust features, this paper introduces
pitch-dependent features and a pitch-dependent speaker model for the
second-stage recognition. Based on the first-stage recognition of
the test speech using a GMM (Gaussian Mixture Model), the system
uses the GMM scores to decide whether the speech needs to be
recognized again. If so, the system selects a few speakers from all
the speakers who participated in the first-stage recognition for the
second stage. For each selected speaker, the system obtains three
pitch-dependent results from his pitch-dependent speaker model, and
then uses an ANN (Artificial Neural Network) to fuse the three
pitch-dependent results and one GMM score into a single fused result.
The system makes the second-stage decision based on these fused
results. The experiments show that the correct rate of the two-stage
recognition system based on MAP channel compensation
and pitch-dependent features is 41.7% better than the baseline system
in the closed-set test.
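The two-stage decision and score fusion described above can be sketched as follows; the score-margin trigger and the fixed fusion weights (standing in for the trained ANN) are assumptions for illustration only.

```python
# Illustrative sketch of the two-stage decision and score fusion.
def needs_second_stage(gmm_scores, margin=0.05):
    """True when the best and second-best GMM scores are within `margin`."""
    ranked = sorted(gmm_scores.values(), reverse=True)
    return len(ranked) > 1 and (ranked[0] - ranked[1]) < margin

def fuse(gmm_score, pitch_scores, weights=(0.4, 0.2, 0.2, 0.2)):
    """Weighted fusion of one GMM score and three pitch-dependent scores."""
    inputs = [gmm_score] + list(pitch_scores)
    return sum(w * x for w, x in zip(weights, inputs))

scores = {"spk1": 0.81, "spk2": 0.79, "spk3": 0.40}
print(needs_second_stage(scores))             # -> True
print(round(fuse(0.81, [0.7, 0.6, 0.9]), 3))  # -> 0.764
```

In the paper the fusion weights are learned by the ANN rather than fixed; the sketch only shows the data flow of the second stage.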
Abstract: This paper presents a recognition system for isolated
words, such as robot commands, carried out by Time Delay Neural
Networks (TDNN), to teleoperate a robot for specific tasks such as
“turn" and “close" in an industrial environment, taking into account
the noise coming from the machines. The choice of TDNN is based on
its generalization in terms of accuracy; moreover, it acts as a
filter that passes certain desirable frequency characteristics of
speech. The goal is to determine the parameters of this filter so as
to make the system adaptable to the variability of the speech signal
and especially to noise; for this, the back-propagation technique was
used in the learning phase. The approach was applied to commands
pronounced in two languages separately, French and Arabic. The
results for two test sets of 300 spoken words each are 87% and 97.6%
in a quiet environment, and 77.67% and 92.67% when white Gaussian
noise was added at an SNR of 35 dB.
Abstract: Hand gesture recognition is an active area of research in
the vision community, mainly for the purposes of sign language
recognition and Human Computer Interaction. In this paper, we propose
a system to recognize alphabet characters (A-Z) and numbers (0-9) in
real time from stereo color image sequences using Hidden Markov
Models (HMMs). Our system is based on three main stages: automatic
segmentation and preprocessing of the hand regions, feature
extraction, and classification. In the automatic segmentation and
preprocessing stage, color and a 3D depth map are used to detect the
hands, whose trajectory is then tracked using the Mean-shift
algorithm and a Kalman filter. In the feature extraction stage, 3D
combined features of location, orientation, and velocity with respect
to Cartesian coordinates are used. Then, k-means clustering is
employed to build the HMM codewords. In the final stage,
classification, the Baum-Welch algorithm is used to fully train the
HMM parameters. The gestures of alphabets and numbers are recognized
using a Left-Right Banded model in conjunction with the Viterbi
algorithm. Experimental results demonstrate that our system can
successfully recognize hand gestures with a 98.33% recognition rate.
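The codeword step above, vector-quantizing continuous feature vectors into a discrete observation sequence for the HMMs, can be sketched as follows; the feature values, the naive initialization, and the number of codewords are illustrative.

```python
# Sketch of k-means vector quantization for HMM codewords.
def kmeans(points, k, iters=20):
    centroids = [list(p) for p in points[:k]]  # naive init: first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: sum(
                (a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[idx].append(p)
        centroids = [
            [sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

def quantize(points, centroids):
    """Map each feature vector to the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda i: sum(
                (a - b) ** 2 for a, b in zip(p, centroids[i])))
            for p in points]

features = [(0.1, 0.0), (0.0, 0.2), (5.0, 5.1), (4.9, 5.0)]
codebook = kmeans(features, k=2)
print(quantize(features, codebook))  # -> [0, 0, 1, 1]
```

The resulting index sequence is what a discrete HMM consumes during Baum-Welch training and Viterbi decoding.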
Abstract: In this paper, a novel algorithm based on the Ridgelet
transform and a support vector machine is proposed for human action
recognition. The Ridgelet transform is a directional multi-resolution
transform, well suited to describing human actions by capturing
their directional information to form spatial feature vectors. The
dynamic transition between the spatial features is modeled using both
Principal Component Analysis and the k-means clustering algorithm.
First, Principal Component Analysis is used to reduce the
dimensionality of the obtained vectors. Then, the k-means algorithm
is used to map the obtained vectors to a spatio-temporal pattern,
called a set-of-labels, according to the given periodicity of the
human action. Finally, a Support Vector Machine classifier is used to
discriminate between the different human actions. Tests were
conducted on popular datasets such as Weizmann and KTH. The results
show that the proposed method provides a significant accuracy rate
and is more robust in very challenging situations such as lighting
changes, scaling, and dynamic environments.
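The PCA step applied to the spatial feature vectors before clustering can be sketched as follows; the data here are random placeholders standing in for Ridgelet features.

```python
# Sketch of PCA dimensionality reduction via SVD of centered data.
import numpy as np

def pca_reduce(data, n_components):
    """Project the rows of `data` onto the top principal components."""
    centered = data - data.mean(axis=0)
    # Right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(1)
features = rng.standard_normal((50, 10))
reduced = pca_reduce(features, n_components=3)
print(reduced.shape)  # -> (50, 3)
```

The reduced vectors would then be fed to k-means to produce the set-of-labels pattern described above.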
Abstract: A word recognition architecture based on a network
of neural associative memories and hidden Markov models has been
developed. The input stream, composed of subword units such as
word-internal triphones consisting of diphones and triphones, is
provided to the network of neural associative memories by the hidden
Markov models. The word recognition network derives words from this
input stream. The architecture is able to handle ambiguities at the
subword-unit level and can also add new words to the vocabulary at
run time. The architecture is implemented to perform the word
recognition task in a language processing system for understanding
simple command sentences such as “bot show apple".
Abstract: On-line handwritten scripts are usually dealt with as pen-tip traces from pen-down to pen-up positions. The time evolution of the pen coordinates is also considered along with the trajectory information. However, the data obtained need a lot of preprocessing, including filtering, smoothing, slant removal, and size normalization, before the recognition process. Instead of such lengthy preprocessing, this paper presents a simple approach to extract the useful character information. This work evaluates the use of the counter-propagation neural network (CPN) and presents the feature extraction mechanism in full detail for on-line handwriting recognition. The recognition rates obtained with the CPN were 60% to 94% for different sets of character samples. This paper also describes a performance study in which a recognition mechanism with multiple thresholds is evaluated for the counter-propagation architecture. The results indicate that the application of multiple thresholds has a significant effect on the recognition mechanism. The method is applicable to off-line character recognition as well. The technique was tested on upper-case English alphabets in a number of different styles from different people.
Abstract: Computerized alarm systems have been applied
increasingly to nuclear power plants. For existing plants, an add-on
computer alarm system is often installed in the control rooms. Alarm
avalanches during plant transients are a major problem with the
alarm systems in nuclear power plants. Computerized alarm systems
can process alarms to reduce the number of alarms during plant
transients. This paper describes various alarm processing methods, an
alarm cause tracking function, and various alarm presentation schemes
that show alarm information to the operators effectively; these were
considered during the development of several computerized alarm
systems for Korean nuclear power plants and were found to be helpful
to the operators.
Abstract: In this paper, we present a comparative study between two computer vision systems for object recognition and tracking. These algorithms describe two different approaches based on regions, constituted by sets of pixels, which parameterize the objects in shot sequences. For image segmentation and object detection, the FCM technique is used; the overlap between cluster distributions is minimized by the use of a suitable color space (other than RGB). The first technique takes into account the a priori probabilities governing the computation of the various clusters to track objects. A Parzen kernel method is described that allows the players in each frame to be identified; we also show the importance of finding the standard deviation of the Gaussian probability density function. Region matching is carried out by an algorithm that operates on the Mahalanobis distance between region descriptors in two subsequent frames and uses singular value decomposition to compute a set of correspondences satisfying both the principle of proximity and the principle of exclusion.
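A simplified version of the region-matching step can be sketched as follows. The descriptors and covariance are illustrative, and the greedy exclusion rule here stands in for the paper's SVD-based correspondence computation.

```python
# Sketch of matching region descriptors across two frames by
# Mahalanobis distance, with a greedy one-to-one exclusion rule.
import numpy as np

def mahalanobis(x, y, cov):
    d = np.asarray(x) - np.asarray(y)
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

def match_regions(prev_regions, next_regions, cov):
    """Greedy nearest match obeying exclusion: each region used once."""
    pairs, used = [], set()
    for i, p in enumerate(prev_regions):
        dists = [(mahalanobis(p, q, cov), j)
                 for j, q in enumerate(next_regions) if j not in used]
        if dists:
            d, j = min(dists)
            pairs.append((i, j))
            used.add(j)
    return pairs

cov = np.diag([4.0, 4.0])
prev_frame = [[10.0, 10.0], [40.0, 40.0]]
next_frame = [[41.0, 39.0], [11.0, 10.0]]
print(match_regions(prev_frame, next_frame, cov))  # -> [(0, 1), (1, 0)]
```

Weighting the distance by the descriptor covariance is what makes the match robust to features with different scales.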
Abstract: The purpose of this study is to identify ideal urban
design elements of waterfronts and to analyze the differences in
users' cognition of these elements. This study follows three steps:
first, identifying the urban design elements of waterfronts from a
literature review; second, evaluating users' cognition of urban
design elements in urban waterfronts; and third, analyzing the
differences in users' cognition. As a result, users' evaluations of
waterfront areas show the similar feature that non-waterfront urban
design elements carry the highest degree of importance. This
indicates that the difference in users' cognition has dimensions of
frequency and distance, and demonstrates differences in importance
rather than in satisfaction. The Multi-Dimensional Scaling method
verifies the differences in their cognition. This study provides
elements to increase user satisfaction based on differences in their
cognition of design elements for waterfronts. It also suggests
implications for these elements when waterfronts are built.
Abstract: Studies in neuroscience suggest that both global and
local feature information are crucial for the perception and
recognition of faces. It is widely believed that local features are
less sensitive to variations caused by illumination and expression.
In this paper, we target designing and learning local features for
face recognition. We designed three types of local features: the
semi-global feature, the local patch feature, and the tangent shape
feature. The design of the semi-global feature aims at taking
advantage of global-like features while avoiding hampering the
AdaBoost algorithm in boosting weak classifiers built from small
local patches. The design of the local patch feature targets the
automatic selection of discriminative features, and thus differs from
traditional approaches, in which local patches are usually selected
manually to cover the salient facial components. Shape features are
also considered in this paper for frontal-view face recognition.
These features are selected and combined within the framework of a
boosting algorithm and a cascade structure. The experimental results
demonstrate that the proposed approach outperforms the standard
eigenface method and the Bayesian method. Moreover, the selected
local features and the observations in the experiments are
enlightening for research on local feature design in face
recognition.
Abstract: A speech corpus is one of the major components of a
Speech Processing System, where one of the primary requirements
is to recognize an input sample. The quality of, and details captured
in, the speech corpus directly affect the precision of recognition.
The current work proposes a platform for speech corpus generation
using an adaptive LMS filter and LPC cepstrum, as part of an
ANN-based Speech Recognition System designed exclusively to
recognize isolated numerals of the Assamese language, a major
language of the north-eastern part of India. The work focuses on
designing an optimal feature extraction block and a few ANN-based
cooperative architectures so that the performance of the Speech
Recognition System can be improved.
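An adaptive LMS filter of the kind mentioned above can be sketched as follows; the tap count, step size, and test signal are illustrative choices, not the paper's configuration.

```python
# Minimal sketch of an adaptive LMS FIR filter.
import math

def lms_filter(desired, reference, taps=4, mu=0.05):
    """Adapt FIR weights so the filtered reference tracks `desired`."""
    weights = [0.0] * taps
    output, errors = [], []
    for n in range(len(desired)):
        window = [reference[n - i] if n - i >= 0 else 0.0
                  for i in range(taps)]
        y = sum(w * x for w, x in zip(weights, window))
        e = desired[n] - y
        # Standard LMS weight update.
        weights = [w + 2 * mu * e * x for w, x in zip(weights, window)]
        output.append(y)
        errors.append(e)
    return output, errors

signal = [math.sin(0.3 * n) for n in range(200)]
_, errs = lms_filter(signal, signal)
# The error shrinks as the weights adapt.
print(sum(abs(e) for e in errs[-20:]) < sum(abs(e) for e in errs[:20]))  # -> True
```

In a corpus-generation pipeline the reference input would be a noise estimate rather than the desired signal itself; the sketch only shows the adaptation loop.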
Abstract: Nowadays, OCR systems have many applications and are
increasingly employed in daily life. Much research has been done on
the identification of Latin, Japanese, and Chinese characters.
However, very little investigation has been performed on
Farsi/Arabic character recognition. The reason is probably the
difficulty and complexity of identifying those characters compared
to the others, and the limited IT activity in Farsi- and
Arabic-speaking countries. In this paper, a technique is employed to
identify isolated Farsi/Arabic characters. A chain-code-based
algorithm, along with other significant peculiarities such as the
number and location of dots and auxiliary parts and the number of
holes in the isolated character, is used in this study to identify
Farsi/Arabic characters. Experimental results show the relatively
high accuracy of the developed method when tested on several
standard Farsi fonts.
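The chain-code idea can be sketched with the standard 8-direction Freeman code; the pixel path here is an illustrative toy contour, not data from the paper.

```python
# Sketch of 8-direction Freeman chain coding of a contour path.
# Map (dx, dy) steps between successive boundary pixels to codes 0-7.
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(path):
    """Encode a boundary pixel path as a Freeman chain code."""
    return [DIRECTIONS[(x2 - x1, y2 - y1)]
            for (x1, y1), (x2, y2) in zip(path, path[1:])]

# A small square traced through its four corners.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
print(chain_code(square))  # -> [0, 2, 4, 6]
```

Counting dots, auxiliary parts, and holes would then supplement this code sequence to disambiguate Farsi/Arabic characters that share a body shape.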
Abstract: Face recognition has always been a fascinating research area. It has drawn the attention of many researchers because of its various potential applications, such as security systems, entertainment, and criminal identification. Many supervised and unsupervised learning techniques have been reported so far. Principal Component Analysis (PCA), Self-Organizing Maps (SOM), and Independent Component Analysis (ICA) are three of the unsupervised techniques, among many others, proposed by different researchers for face recognition. This paper proposes the integration of two of these techniques, SOM and PCA, for dimensionality reduction and feature selection. Simulation results show that, although the individual techniques SOM and PCA give excellent performance on their own, the combination of the two can also be utilized for face recognition. Experimental results also indicate that, for the given face database and the classifier used, SOM performs better than other unsupervised learning techniques. A comparison of the two proposed SOM methodologies, local and global processing, shows the superiority of the latter, but at the cost of more computation time.