Abstract: The performance of any continuous speech recognition system depends heavily on the performance of its acoustic models. In general, developing robust spoken language technology relies on the availability of large amounts of data. A common way to cope with scarce training data for each state of the Markov models is tree-based state tying, which applies contextual questions to tie states. Generating these questions manually is time-consuming and prone to human error, so various automatically generated question sets have been used to construct the decision tree. There are three approaches to generating questions for decision-tree-based HMM construction: one is based on misrecognized phonemes, another uses a phonetic feature table, and the third is based on the state distributions of context-independent subword units. In this paper, all three methods of automatic question generation are applied to decision-tree construction on the FARSDAT corpus of the Persian language, and their results are compared with those of manually generated questions. The results show that automatically generated questions perform considerably better and can replace manually generated questions for Persian.
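The splitting criterion shared by these question-generation approaches can be sketched as the log-likelihood gain of dividing pooled state observations by a contextual question. This is a minimal illustration under the simplifying assumptions of 1-D Gaussian state distributions and toy context labels; the corpus, HMM topology, and actual question sets are those of the paper and are not reproduced here.

```python
import numpy as np

def gauss_ll(values):
    # log-likelihood of data under its own maximum-likelihood Gaussian (1-D)
    n = len(values)
    var = np.var(values) + 1e-6          # variance floor avoids log(0)
    return -0.5 * n * (np.log(2 * np.pi * var) + 1.0)

def question_gain(states, question):
    """Log-likelihood gain from splitting tied states by a contextual question.

    `states` maps a context label (e.g. the left phone) to its observations;
    `question` is the set of contexts answering 'yes'.
    """
    pool = np.concatenate(list(states.values()))
    yes = np.concatenate([v for c, v in states.items() if c in question])
    no = np.concatenate([v for c, v in states.items() if c not in question])
    return gauss_ll(yes) + gauss_ll(no) - gauss_ll(pool)

# toy contexts: vowel left-contexts shift the state's mean, consonants do not
rng = np.random.default_rng(0)
states = {"a": rng.normal(5, 1, 200), "e": rng.normal(5, 1, 200),
          "t": rng.normal(0, 1, 200), "k": rng.normal(0, 1, 200)}
good = question_gain(states, {"a", "e"})   # matches the true split
bad = question_gain(states, {"a", "t"})    # mixes the two clusters
```

A decision-tree builder would greedily pick, at each node, the question with the largest gain; the three methods above differ only in how the candidate question sets are produced.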
Abstract: Screening for cervical cancer is particularly complex and time-consuming, because it consists in seeking a few abnormal cells among a cluster of normal cells. In this paper, we present our proposed computer system for helping doctors screen for cervical cancer. Since the diagnosis of malignancy is based on a set of atypical morphological details of the cells, and is carried out by analyzing the nucleus and the cytoplasm, we present an unsupervised genetic algorithm for separating the cell components. We also give the various algorithms used for computing the morphological characteristics of the cells (nucleus/cytoplasm ratio, cellular deformity, ...) necessary for recognizing the illness.
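Once the genetic algorithm has separated the cell components, the morphological characteristics reduce to simple region measurements. A minimal numpy sketch, assuming binary masks for nucleus and cytoplasm as input (the segmentation itself is the paper's contribution and is not reproduced); the deformity measure here is an illustrative circularity-based stand-in:

```python
import numpy as np

def nc_ratio(nucleus_mask, cytoplasm_mask):
    """Nucleus/cytoplasm area ratio -- enlarged nuclei raise this value."""
    return nucleus_mask.sum() / max(cytoplasm_mask.sum(), 1)

def deformity(mask):
    """Rough deformity: 1 - 4*pi*area/perimeter^2 (0 for a perfect disc).

    The perimeter is approximated by counting boundary pixels, i.e. mask
    pixels with at least one background 4-neighbour.
    """
    area = mask.sum()
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = (mask & ~interior).sum()
    return 1.0 - min(4 * np.pi * area / max(perimeter, 1) ** 2, 1.0)

# toy cell: a disc nucleus inside a larger disc of cytoplasm
yy, xx = np.mgrid[:64, :64]
r2 = (yy - 32) ** 2 + (xx - 32) ** 2
nucleus = r2 <= 8 ** 2
cytoplasm = (r2 <= 24 ** 2) & ~nucleus
ratio = nc_ratio(nucleus, cytoplasm)
```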
Abstract: In this paper, the authors examine whether Japanese Internet users' awareness of information security differs by individual attributes, using a non-parametric analysis of variance on survey data from the Institute for Information and Communications Policy. Generally speaking, it is found that Japanese Internet users' awareness of information security does differ by individual attributes. In particular, the authors verify that users who have received information security education show higher recognition of countermeasures than other users, including self-educated users. This suggests that information security education should be enhanced so that users take information security countermeasures appropriately. In addition, information security policies such as the "e-net caravan" and information security seminars are effective in improving users' awareness of information security in Japan.
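The "analysis of variance based on a non-parametric method" is commonly the Kruskal-Wallis H test; the following sketch assumes that is the intended procedure (the abstract does not name it) and omits the tie correction for brevity:

```python
from itertools import chain

def kruskal_h(*groups):
    """Kruskal-Wallis H statistic across k groups (no tie correction)."""
    pooled = sorted(chain.from_iterable(groups))
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # assumes distinct values
    n = len(pooled)
    h = 0.0
    for g in groups:
        r = sum(rank[v] for v in g)      # rank sum of this group
        h += r * r / len(g)
    return 12.0 / (n * (n + 1)) * h - 3 * (n + 1)

# hypothetical awareness scores for three attribute groups
educated = [78, 85, 91, 88, 79]
self_taught = [61, 72, 70, 66, 74]
no_training = [52, 58, 49, 63, 55]
h = kruskal_h(educated, self_taught, no_training)
# h is compared against a chi-squared critical value with k-1 = 2 df
# (5.99 at the 5% level) to decide whether awareness differs by group
```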
Abstract: The applications of VLSI circuits in high-performance computing, telecommunications, and consumer electronics have been expanding progressively, and at a very rapid pace. This paper describes a new model for partitioning a circuit using DBSCAN and a fuzzy ARTMAP neural network. The first step is feature extraction, for which we make use of the DBSCAN algorithm. The second step is classification, performed by a fuzzy ARTMAP neural network. The performance of both approaches is compared using the MCNC standard cell placement benchmark netlists. Analysis of the experimental results shows that the combined fuzzy ARTMAP and DBSCAN model outperforms fuzzy ARTMAP alone in recognizing sub-circuits with the fewest interconnections between them. The recognition rate using fuzzy ARTMAP with DBSCAN is 97.7%, higher than that of fuzzy ARTMAP alone.
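The density-clustering step can be illustrated with a minimal DBSCAN on 2-D points (pure numpy; the paper applies it to netlist-derived features, and the eps/min_pts values below are illustrative):

```python
import numpy as np

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: label each point with a cluster id, -1 for noise."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbours = [np.flatnonzero(row <= eps) for row in dist]
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbours[i]) < min_pts:
            continue                      # already claimed, or not a core point
        labels[i] = cluster
        stack = [i]
        while stack:                      # expand over density-reachable points
            for k in neighbours[stack.pop()]:
                if labels[k] == -1:
                    labels[k] = cluster
                    if len(neighbours[k]) >= min_pts:
                        stack.append(k)   # core point: keep growing the cluster
        cluster += 1
    return labels

rng = np.random.default_rng(1)
blob_a = rng.normal(0.0, 0.1, (20, 2))    # one tight group of feature vectors
blob_b = rng.normal(5.0, 0.1, (20, 2))    # another, well separated
outlier = np.array([[2.5, 2.5]])
labels = dbscan(np.vstack([blob_a, blob_b, outlier]), eps=0.5, min_pts=4)
```

The appeal for partitioning is that DBSCAN needs no preset cluster count and marks isolated elements as noise rather than forcing them into a partition.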
Abstract: In this paper, a comparative study of supervised and unsupervised learning algorithms applied to illumination-invariant face recognition is carried out. The supervised learning uses a bi-layered artificial neural network with one input, two hidden, and one output layer. Gradient descent with momentum and an adaptive-learning-rate back-propagation algorithm implements the supervised learning: both the inputs and the corresponding outputs are provided at training time, so there is an inherent clustering and optimized learning of the weights, which yields efficient results. The unsupervised learning is implemented with a modified Counterpropagation network, which clusters the inputs and then applies the Outstar rule to obtain the recognized face. The face recognition system has been developed for recognizing faces under varying illumination intensities, where the database images vary in the angle of illumination with respect to the horizontal and vertical planes. The supervised and unsupervised learning algorithms have been implemented and tested exhaustively, with and without histogram equalization, to obtain efficient results.
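The histogram-equalization preprocessing tested above is a standard operation; a minimal numpy sketch for 8-bit greyscale images (the network architectures themselves are not reproduced here):

```python
import numpy as np

def hist_equalize(img):
    """Map grey levels through the normalized cumulative histogram,
    spreading a compressed intensity range across the full 0-255 scale."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())
    return (cdf[img] * 255).astype(np.uint8)

# dark, low-contrast patch: intensities squeezed into [40, 80)
rng = np.random.default_rng(0)
dark = rng.integers(40, 80, size=(32, 32)).astype(np.uint8)
flat = hist_equalize(dark)
```

Equalization reduces the effect of overall lighting level, which is why the study reports results with and without it.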
Abstract: In this paper we propose a novel approach to ascertaining human identity based on the fusion of profile-face and gait biometric cues. The identification approach, based on feature learning in a PCA-LDA subspace and classification with multivariate Bayesian classifiers, yields a significant improvement in recognition accuracy for low-resolution surveillance video scenarios. Experimental evaluation of the proposed identification scheme on a publicly available database [2] showed that fusing face and gait cues in a joint PCA-LDA space is a powerful method for capturing the inherent multimodality of walking gait patterns while discriminating person identity.
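The joint PCA-LDA pipeline can be sketched on synthetic fused feature vectors: concatenate face and gait cues, reduce with PCA, then find a discriminant direction. This sketch uses a two-class Fisher discriminant with a simple midpoint threshold in place of the paper's multivariate Bayesian classifier, and all data, dimensions, and class counts are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def person(face_mean, gait_mean, n=30):
    """Fused sample: profile-face cues concatenated with gait cues."""
    face = rng.normal(face_mean, 1.0, (n, 40))
    gait = rng.normal(gait_mean, 1.0, (n, 60))
    return np.hstack([face, gait])

x = np.vstack([person(0.0, 0.0), person(1.0, -1.0)])
y = np.array([0] * 30 + [1] * 30)

# PCA: project onto the top principal directions of the centred data
mu = x.mean(axis=0)
_, _, vt = np.linalg.svd(x - mu, full_matrices=False)
z = (x - mu) @ vt[:10].T

# LDA (two classes): Fisher direction w = Sw^-1 (m1 - m0) in PCA space
m0, m1 = z[y == 0].mean(axis=0), z[y == 1].mean(axis=0)
sw = np.cov(z[y == 0].T) + np.cov(z[y == 1].T)
w = np.linalg.solve(sw, m1 - m0)
scores = z @ w
threshold = (scores[y == 0].mean() + scores[y == 1].mean()) / 2
accuracy = ((scores > threshold).astype(int) == y).mean()
```

PCA first makes the within-class scatter matrix invertible when the fused feature dimension exceeds the sample count, which is exactly the low-resolution surveillance regime the abstract targets.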
Abstract: The recognition of human faces, especially at different orientations, is a challenging and important problem in image analysis and classification. This paper proposes an effective scheme for rotation-invariant face recognition using combined Log-Polar Transform and Discrete Cosine Transform features. Rotation-invariant feature extraction for a given face image first applies the log-polar transform, which converts the rotation effect into a row shift of the log-polar image. The discrete cosine transform is then applied to eliminate the row-shift effect and to generate a low-dimensional feature vector. A PSO-based feature selection algorithm searches the feature vector space for the optimal feature subset, with evolution driven by a fitness function defined in terms of maximizing the between-class separation (scatter index). Experimental results on the ORL face database, using test sets of images at different orientations, show that the proposed system outperforms other face recognition methods: the overall recognition rate for the rotated test images is 97%, demonstrating that the extracted feature vector is an effective rotation-invariant feature set with a minimal number of selected features.
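The rotation-to-row-shift property of the log-polar transform can be verified directly on a continuous toy image. To keep the sketch self-checking, the residual row shift is removed here with the magnitude of a 1-D Fourier transform along the angular axis, which is exactly circular-shift invariant; the paper instead uses a DCT on the row-shifted image:

```python
import numpy as np

def log_polar(f, K=36, R=16):
    """Sample a continuous image f(x, y) on a log-polar grid:
    rows index angle, columns index log-radius."""
    radii = np.exp(np.linspace(-3.0, 0.0, R))
    theta = 2.0 * np.pi * np.arange(K) / K
    return f(np.outer(np.cos(theta), radii), np.outer(np.sin(theta), radii))

def scene(x, y):
    # a smooth off-centre blob standing in for a face image
    return np.exp(-((x - 0.4) ** 2 + (y - 0.1) ** 2) / 0.05)

alpha = 2.0 * np.pi * 5 / 36          # rotate by exactly 5 angular bins
def rotated(x, y):
    return scene(np.cos(alpha) * x + np.sin(alpha) * y,
                 -np.sin(alpha) * x + np.cos(alpha) * y)

lp = log_polar(scene)                 # rotation of the scene becomes a
lp_rot = log_polar(rotated)           # circular shift of the rows of lp
shift_invariant = np.abs(np.fft.fft(lp, axis=0))
shift_invariant_rot = np.abs(np.fft.fft(lp_rot, axis=0))
```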
Abstract: Document image processing has become an increasingly important technology in the automation of office documentation tasks. During document scanning, skew is inevitably introduced into the incoming document image. Since layout analysis and character recognition algorithms are generally very sensitive to page skew, skew detection and correction in document images is a critical step before layout analysis. In this paper, a novel skew detection method is presented for binary document images. The method selects certain characters of the text, which are subjected to thinning and the Hough transform to estimate the skew angle accurately. Several experiments have been conducted on various types of documents, including English documents, journals, text-books, documents in different languages, documents with different fonts, and documents at different resolutions, to demonstrate the robustness of the proposed method. The experimental results show that the proposed method is accurate compared with well-known existing methods.
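The Hough-based angle estimation at the core of such methods can be sketched as a vote over candidate skew angles: the true skew is the angle whose normal-form lines collect the most collinear foreground pixels. The character selection and thinning steps of the paper are skipped; the input here is a synthetic baseline of "text" pixels:

```python
import numpy as np

def estimate_skew(points, search=15.0, step=0.25, rho_bin=2.0):
    """Return the skew angle (degrees) maximizing the Hough accumulator peak."""
    best_angle, best_votes = 0.0, -1
    for cand in np.arange(-search, search + step, step):
        theta = np.radians(90.0 + cand)        # normal angle of a text baseline
        rho = points[:, 0] * np.cos(theta) + points[:, 1] * np.sin(theta)
        # quantize rho; the peak bin counts how many points share one line
        votes = np.bincount(((rho - rho.min()) / rho_bin).astype(int)).max()
        if votes > best_votes:
            best_angle, best_votes = cand, votes
    return best_angle

# synthetic baseline pixels from a page skewed by 5 degrees
xs = np.arange(0, 400, 2.0)
baseline = np.stack([xs, np.tan(np.radians(5.0)) * xs], axis=1)
skew_deg = estimate_skew(baseline)
```

In a full system the detected angle is then used to rotate the page back before layout analysis.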
Abstract: Pineapples can be classified using an index with seven levels of maturity based on the green and yellow color of the skin. As the pineapple ripens, the skin changes from pale green to a golden or yellowish color. A current issue in agriculture is that farmers are unable to distinguish between the indexes of pineapple maturity correctly and effectively. There are several reasons why farmers cannot properly follow the guidelines provided by the Federal Agriculture Marketing Authority (FAMA); one is that, because inspection is done manually by experts, there are no specific, universal guidelines for farmers to adopt, owing to the differing points of view of the experts who sort pineapples based on their own knowledge and experience. An automatic system would therefore help farmers identify pineapple maturity effectively and would serve as a universal indicator for farmers.
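A minimal sketch of such a color-based maturity index: count the fraction of golden pixels on the skin and bin it into seven levels. The RGB thresholds and bin boundaries below are illustrative assumptions, not FAMA's official cut-offs:

```python
import numpy as np

def maturity_index(rgb):
    """Map the yellow-pixel fraction of the skin to a 1-7 maturity index."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    skin = (r + g + b) > 60                            # ignore dark background
    yellow = skin & (r > 120) & (g > 100) & (b < 90)   # golden tones
    fraction = yellow.sum() / max(skin.sum(), 1)
    return 1 + min(int(fraction * 7), 6)               # ~14% coverage per level

# toy skin patch: 30% golden pixels over a pale-green base
patch = np.zeros((10, 10, 3), dtype=np.uint8)
patch[...] = (40, 140, 50)          # pale green
patch[:3] = (200, 170, 40)          # golden rows
index = maturity_index(patch)
```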
Abstract: In illumination-variant face recognition, existing methods that extract the face albedo as a light-normalized image can lose extensive facial detail, since the light template is discarded. To improve on this, a novel approach for realistic facial texture reconstruction that combines the original image and the albedo image is proposed. First, light subspaces for different identities are established from the given reference face images; then, by projecting the original and albedo images into each light subspace, texture reference images with the corresponding lighting are reconstructed and two texture subspaces are formed. From the projections in the texture subspaces, facial texture under normal lighting can be synthesized. Because the original image is combined with the face albedo, facial details are preserved. In addition, image partitioning is applied to improve the synthesis performance. Experiments on the Yale B and CMU PIE databases demonstrate that this algorithm outperforms the others both in image representation and in face recognition.
Abstract: In this paper a new approach to face recognition is presented that achieves a double dimension reduction, making the system computationally efficient while giving better recognition results. In pattern recognition, the discriminative information of an image increases with resolution up to a certain extent; face recognition results therefore improve with increasing face image resolution and level off once a certain resolution is reached. In the proposed model, an image decimation algorithm is first applied to the face image, reducing its dimension to the resolution level that gives the best recognition results. The Discrete Cosine Transform (DCT) is then applied to the face image, owing to its computational speed and feature extraction potential. A subset of DCT coefficients from the low to mid frequencies, which represents the face adequately and gives the best recognition results, is retained. A trade-off between the decimation factor, the number of DCT coefficients retained, and the recognition rate at minimum computation is obtained. Preprocessing of the image increases its robustness against variations in pose and illumination level. This new model has been tested on several databases, including the ORL database, the Yale database, and a color database, and has performed much better than other techniques. The significance of the model is twofold: (1) dimension reduction to an effective and suitable face image resolution, and (2) retention of appropriate DCT coefficients to achieve the best recognition results under varying pose, intensity, and illumination.
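The coefficient-retention step can be sketched with an orthonormal 2-D DCT: transform, keep only a low-frequency block, and inverse-transform. The 16x16 gradient image and the 4x4 retained block are illustrative stand-ins for a decimated face and the paper's tuned coefficient subset:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis: C @ C.T == I."""
    k = np.arange(n)[:, None]
    c = np.cos(np.pi * (2 * np.arange(n) + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    c[0] /= np.sqrt(2.0)
    return c

def keep_low_freq(img, m):
    """2-D DCT, retain the m x m low-frequency block, inverse-transform."""
    c = dct_matrix(img.shape[0])
    coeffs = c @ img @ c.T
    mask = np.zeros_like(coeffs)
    mask[:m, :m] = 1.0                 # low-to-mid frequencies only
    return c.T @ (coeffs * mask) @ c

# smooth 16x16 gradient standing in for a decimated face image
n = 16
yy, xx = np.mgrid[:n, :n] / n
img = yy + 0.5 * xx
approx = keep_low_freq(img, 4)         # 16 of 256 coefficients retained
err = np.abs(approx - img).mean()
```

The DCT's energy compaction is what makes this work: for smooth images most of the signal lives in the low-frequency block, so a small retained subset reconstructs the face with little error.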
Abstract: Human-friendly interaction is the key function of a human-centered system. Over the years, much attention has been devoted to developing convenient interaction through intention recognition, which processes multimodal inputs including speech, face images, and body gestures. In this paper, we suggest a novel approach to intention recognition using a graph representation called the Intention Graph. A concept of valid intention is proposed as the target of intention recognition. Our approach has two phases: a goal recognition phase and an intention recognition phase. In the goal recognition phase, we generate an action graph from the observed actions and then recognize the candidate goals and their plans. In the intention recognition phase, the intention is recognized from the relevant goals and the user profile. We show that the algorithm has polynomial time complexity. The Intention Graph is applied to a simple briefcase domain to test our model.
Abstract: One astonishing capability of humans is to recognize thousands of different objects visually, and to learn the semantic association between those objects and the words referring to them. This work is an attempt to build a computational model of that capacity, simulating the process by which infants learn to recognize objects and words through exposure to visual stimuli and vocal sounds. One of the main facts shaping the brain of a newborn is that lights and colors come from entities in the world. Gradually the visual system learns which light sensations belong to the same entities, despite large changes in appearance. This experience is common to humans and several other mammals, such as non-human primates. But only humans can recognize a huge variety of objects, many of them manufactured by humans themselves, and make use of sounds to identify and categorize them. The aim of this model is to reproduce these processes in a biologically plausible way, by reconstructing the essential hierarchy of cortical circuits along the visual and auditory neural paths.
Abstract: In face recognition, feature extraction techniques attempt to find an appropriate representation of the data. However, when the feature dimension is larger than the sample size, performance degrades. Hence, we propose a method called Normalization Discriminant Independent Component Analysis (NDICA). The input data are regularized to obtain the most reliable features and then processed using Independent Component Analysis (ICA). The proposed method is evaluated on three face databases: Olivetti Research Ltd (ORL), Face Recognition Technology (FERET), and Face Recognition Grand Challenge (FRGC). NDICA shows its effectiveness compared with other unsupervised and supervised techniques.
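The abstract does not specify its regularization, but a common realization of the "regularize, then run ICA" pipeline is regularized whitening: center the data and rescale along eigenvectors with a small ridge on the eigenvalues, which stabilizes exactly the dimension-larger-than-sample-size regime NDICA targets. A numpy sketch of that pre-step (the ICA stage itself is omitted):

```python
import numpy as np

def regularized_whiten(x, eps=1e-4):
    """Centre the data and whiten with regularized eigenvalues -- the ridge
    eps guards the near-singular directions that arise when the feature
    dimension rivals the sample count."""
    xc = x - x.mean(axis=0)
    cov = xc.T @ xc / (len(x) - 1)
    vals, vecs = np.linalg.eigh(cov)
    w = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return xc @ w

# correlated toy features: columns 1-4 each share a component with column 0
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 5))
x[:, 1:] += 0.8 * x[:, :1]
z = regularized_whiten(x)
cov_z = np.cov(z.T)                  # should be close to the identity
```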
Abstract: In this paper we report a study aimed at determining
the most effective animation technique for representing ASL
(American Sign Language) finger-spelling. Specifically, in the study
we compare two commonly used 3D computer animation methods
(keyframe animation and motion capture) in order to ascertain which
technique produces the most 'accurate', 'readable', and 'close to
actual signing' (i.e. realistic) rendering of ASL finger-spelling. To
accomplish this goal we have developed 20 animated clips of finger-spelled
words and we have designed an experiment consisting of a
web survey with rating questions. 71 subjects aged 19-45 participated
in the study. Results showed that recognition of the words was
correlated with the method used to animate the signs. In particular,
keyframe technique produced the most accurate representation of the
signs (i.e., participants were more likely to identify the words
correctly in keyframed sequences rather than in motion captured
ones). Further, findings showed that the animation method had an
effect on the reported scores for readability and closeness to actual
signing; the estimated marginal mean readability and closeness was
greater for keyframed signs than for motion captured signs. To our
knowledge, this is the first study aimed at measuring and comparing
accuracy, readability and realism of ASL animations produced with
different techniques.
Abstract: One problem in evaluating recent computational models of human category learning is that there is no standardized method for systematically comparing the models' assumptions or hypotheses. In the present study, a flexible general model (called GECLE) is introduced as a framework for systematically manipulating and comparing the effects and descriptive validity of a limited number of assumptions at a time. Two example simulation studies are presented to show how the GECLE framework can be useful in research on high-order human cognition.
Abstract: Emotion recognition is an important research field with many applications nowadays. This work focuses on recognizing different emotions from the speech signal. The extracted features relate to statistics of the pitch, formant, and energy contours, as well as spectral, perceptual, and temporal features, jitter, and shimmer. An artificial neural network (ANN) was chosen as the classifier, our concern being a robust and fast ANN classifier suitable for different real-life applications. Several experiments were carried out on different ANNs to investigate the factors that affect the classification success rate. Using a database containing 7 different emotions, it is shown that with proper and careful adjustment of the feature format, training data sorting, number of features selected, and even the ANN type and architecture used, a success rate of 85% or more can be achieved without increasing the system complexity or the computation time.
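Statistics over frame-wise contours, as used above, can be sketched with two of the simplest such features, short-time energy and zero-crossing rate, on synthetic "calm" and "excited" utterances. The full pitch/formant/jitter/shimmer set of the paper is not reproduced, and the two signals are toy stand-ins:

```python
import numpy as np

def frame_features(signal, frame=400, hop=160):
    """Utterance-level statistics of short-time energy and zero-crossing
    rate, a small stand-in for the paper's full prosodic feature set."""
    frames = np.lib.stride_tricks.sliding_window_view(signal, frame)[::hop]
    energy = (frames ** 2).mean(axis=1)
    zcr = (np.diff(np.sign(frames), axis=1) != 0).mean(axis=1)
    return np.array([energy.mean(), energy.std(), zcr.mean(), zcr.std()])

t = np.arange(16000) / 16000            # one second at 16 kHz
calm = 0.2 * np.sin(2 * np.pi * 120 * t)                       # low, steady
excited = (0.8 + 0.4 * np.sin(2 * np.pi * 3 * t)) * \
          np.sin(2 * np.pi * 260 * t)                          # loud, modulated
f_calm = frame_features(calm)
f_excited = frame_features(excited)
```

Vectors like these, one per utterance, are what the ANN classifier is trained on; the experiments in the paper tune their format and count.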
Abstract: In the package design industry, a great deal of tacit knowledge resides within each designer. The objectives were to capture this knowledge and compile it into a teaching resource, to create a video clip of the package design process, and to evaluate its quality and learning effectiveness. Interviews were used to capture knowledge of brand design concept, differentiation, recognition, ranking of recognition factors, consumer surveys, marketing, research, graphic design, the effect of color, and law and regulation. A video clip about package design was created, consisting of both speech and footage of the actual process. The quality of the video as a medium was rated good, while its content was rated excellent. The students' post-test scores were significantly greater than their pre-test scores (p < 0.001).
Abstract: This paper presents an ESN-based Arabic phoneme recognition system trained with supervised, forced, and combined supervised/forced learning algorithms. Mel-Frequency Cepstrum Coefficient (MFCC) and Linear Predictive Coding (LPC) techniques are used and compared as the input feature extraction technique. The system is evaluated using 6 speakers from the King Abdulaziz Arabic Phonetics Database (KAPD) for the Saudi Arabian dialect, and 34 speakers with different dialects from 12 Arab countries drawn from the Center for Spoken Language Understanding (CSLU2002) database. The phoneme recognition performance is 72.31% on KAPD and 38.20% on the CSLU2002 Arabic database.
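Of the two feature techniques compared, LPC is compact enough to sketch in full: estimate the autocorrelation of a frame and solve for the predictor coefficients with the Levinson-Durbin recursion. This is the textbook autocorrelation method applied to a synthetic AR(2) "phoneme", not the paper's MFCC/ESN pipeline:

```python
import numpy as np

def lpc(signal, order):
    """Linear predictive coefficients via autocorrelation + Levinson-Durbin.

    Returns a with a[0] = 1, so sum_k a[k] * x[n-k] is the prediction residual.
    """
    n = len(signal)
    r = np.array([signal[: n - i] @ signal[i:] for i in range(order + 1)])
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0]
    for i in range(1, order + 1):
        k = -(r[i] + a[1:i] @ r[i - 1:0:-1]) / err   # reflection coefficient
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]          # update inner coefficients
        a[i] = k
        err *= 1.0 - k * k                           # shrink prediction error
    return a

# synthetic AR(2) source: x[n] = 0.75 x[n-1] - 0.5 x[n-2] + noise
rng = np.random.default_rng(0)
e = rng.normal(size=20000)
x = np.zeros(20000)
for n in range(2, 20000):
    x[n] = 0.75 * x[n - 1] - 0.5 * x[n - 2] + e[n]
a = lpc(x, 2)        # should recover approximately [1, -0.75, 0.5]
```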
Abstract: Neural processors have shown good results in detecting a given character in an input matrix. In this paper, a new idea to speed up the operation of neural processors for character detection is presented. Such processors are designed around the cross correlation, computed in the frequency domain, between the input matrix and the weights of the neural networks. The approach developed here reduces the number of computation steps these fast neural networks require for the search process. The principle of divide and conquer is applied through image decomposition: each image is divided into small sub-images, and each sub-image is then tested separately by a single fast neural processor. Furthermore, faster character detection is obtained by using parallel processing to test the resulting sub-images at the same time with the same number of fast neural networks. In contrast to using fast neural processors alone, the speed-up ratio grows with the size of the input image when fast neural processors are combined with image decomposition. Moreover, the problem of local sub-image normalization in the frequency domain is solved, and the effect of image normalization on the speed-up ratio of character detection is discussed. Simulation results show that local sub-image normalization through weight normalization is faster than sub-image normalization in the spatial domain, and the overall speed-up ratio of the detection process increases because the weight normalization is done off-line.
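The frequency-domain cross-correlation at the heart of these fast processors can be sketched with numpy's FFT: one element-wise multiply replaces sliding the template (here a stand-in for the trained weights) across every position. The sub-image decomposition and parallel testing of the paper are omitted; the glyph and its planted location are synthetic:

```python
import numpy as np

def correlate_fft(image, template):
    """Circular cross-correlation computed in the frequency domain."""
    t = template - template.mean()          # zero-mean matched filter
    f_img = np.fft.fft2(image)
    f_tpl = np.fft.fft2(t, s=image.shape)   # zero-padded to image size
    return np.real(np.fft.ifft2(f_img * np.conj(f_tpl)))

rng = np.random.default_rng(0)
glyph = rng.random((8, 8))                  # the "character" template
image = 0.1 * rng.random((64, 64))          # low-level background clutter
image[20:28, 36:44] += glyph                # plant the character at (20, 36)
score = correlate_fft(image, glyph)
row, col = np.unravel_index(score.argmax(), score.shape)
```

For an N x N image the FFT route costs O(N^2 log N) versus O(N^2 M^2) for direct sliding with an M x M template, and splitting the image into sub-images lets several such correlations run in parallel, which is the speed-up the abstract quantifies.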