Abstract: The recognition of handwritten numeral is an
important area of research for its applications in post office, banks
and other organizations. This paper presents automatic recognition of
handwritten Kannada numerals based on structural features. Five
different types of features, namely, profile based 10-segment string,
water reservoir; vertical and horizontal strokes, end points and
average boundary length from the minimal bounding box are used in
the recognition of numeral. The effect of each feature and their
combination in the numeral classification is analyzed using nearest
neighbor classifiers. It is common to combine multiple categories of
features into a single feature vector for the classification. Instead,
separate classifiers can be used to classify based on each visual
feature individually and the final classification can be obtained based
on the combination of separate base classification results. One
popular approach is to combine the classifier results into a feature
vector and leaving the decision to next level classifier. This method
is extended to extract a better information, possibility distribution,
from the base classifiers in resolving the conflicts among the
classification results. Here, we use fuzzy k Nearest Neighbor (fuzzy
k-NN) as base classifier for individual feature sets, the results of
which together forms the feature vector for the final k Nearest
Neighbor (k-NN) classifier. Testing is done, using different features,
individually and in combination, on a database containing 1600
samples of different numerals and the results are compared with the
results of different existing methods.
Abstract: This paper presents a recognition system for isolated
words like robot commands. It’s carried out by Time Delay Neural
Networks; TDNN. To teleoperate a robot for specific tasks as turn,
close, etc… In industrial environment and taking into account the
noise coming from the machine. The choice of TDNN is based on its
generalization in terms of accuracy, in more it acts as a filter that
allows the passage of certain desirable frequency characteristics of
speech; the goal is to determine the parameters of this filter for
making an adaptable system to the variability of speech signal and to
noise especially, for this the back propagation technique was used in
learning phase. The approach was applied on commands pronounced
in two languages separately: The French and Arabic. The results for
two test bases of 300 spoken words for each one are 87%, 97.6% in
neutral environment and 77.67%, 92.67% when the white Gaussian
noisy was added with a SNR of 35 dB.
Abstract: In this study, the use of silicon NAM (Non-Audible
Murmur) microphone in automatic speech recognition is presented.
NAM microphones are special acoustic sensors, which are attached
behind the talker-s ear and can capture not only normal (audible)
speech, but also very quietly uttered speech (non-audible murmur).
As a result, NAM microphones can be applied in automatic speech
recognition systems when privacy is desired in human-machine communication.
Moreover, NAM microphones show robustness against
noise and they might be used in special systems (speech recognition,
speech conversion etc.) for sound-impaired people. Using a small
amount of training data and adaptation approaches, 93.9% word
accuracy was achieved for a 20k Japanese vocabulary dictation
task. Non-audible murmur recognition in noisy environments is also
investigated. In this study, further analysis of the NAM speech has
been made using distance measures between hidden Markov model
(HMM) pairs. It has been shown the reduced spectral space of NAM
speech using a metric distance, however the location of the different
phonemes of NAM are similar to the location of the phonemes
of normal speech, and the NAM sounds are well discriminated.
Promising results in using nonlinear features are also introduced,
especially under noisy conditions.
Abstract: Realistic 3D face model is desired in various
applications such as face recognition, games, avatars, animations, and
etc. Construction of 3D face model is composed of 1) building a face
shape model and 2) rendering the face shape model. Thus, building a
realistic 3D face shape model is an essential step for realistic 3D face
model. Recently, 3D morphable model is successfully introduced to
deal with the various human face shapes. 3D dense correspondence
problem should be precedently resolved for constructing a realistic 3D
dense morphable face shape model. Several approaches to 3D dense
correspondence problem in 3D face modeling have been proposed
previously, and among them optical flow based algorithms and TPS
(Thin Plate Spline) based algorithms are representative. Optical flow
based algorithms require texture information of faces, which is
sensitive to variation of illumination. In TPS based algorithms
proposed so far, TPS process is performed on the 2D projection
representation in cylindrical coordinates of the 3D face data, not
directly on the 3D face data and thus errors due to distortion in data
during 2D TPS process may be inevitable.
In this paper, we propose a new 3D dense correspondence algorithm
for 3D dense morphable face shape modeling. The proposed algorithm
does not need texture information and applies TPS directly on 3D face
data. Through construction procedures, it is observed that the proposed
algorithm constructs realistic 3D face morphable model reliably and
fast.
Abstract: Sign language is used by the deaf and hard of hearing people for communication. Automatic sign language recognition is a challenging research area since sign language often is the only way of communication for the deaf people. Sign language includes different components of visual actions made by the signer using the hands, the face, and the torso, to convey his/her meaning. To use different aspects of signs, we combine the different groups of features which have been extracted from the image frames recorded directly by a stationary camera. We combine the features in two levels by employing three techniques. At the feature level, an early feature combination can be performed by concatenating and weighting different feature groups, or by concatenating feature groups over time and using LDA to choose the most discriminant elements. At the model level, a late fusion of differently trained models can be carried out by a log-linear model combination. In this paper, we investigate these three combination techniques in an automatic sign language recognition system and show that the recognition rate can be significantly improved.
Abstract: Hand gesture is an active area of research in the vision
community, mainly for the purpose of sign language recognition and
Human Computer Interaction. In this paper, we propose a system to
recognize alphabet characters (A-Z) and numbers (0-9) in real-time
from stereo color image sequences using Hidden Markov Models
(HMMs). Our system is based on three main stages; automatic segmentation
and preprocessing of the hand regions, feature extraction
and classification. In automatic segmentation and preprocessing stage,
color and 3D depth map are used to detect hands where the hand
trajectory will take place in further step using Mean-shift algorithm
and Kalman filter. In the feature extraction stage, 3D combined features
of location, orientation and velocity with respected to Cartesian
systems are used. And then, k-means clustering is employed for
HMMs codeword. The final stage so-called classification, Baum-
Welch algorithm is used to do a full train for HMMs parameters.
The gesture of alphabets and numbers is recognized using Left-Right
Banded model in conjunction with Viterbi algorithm. Experimental
results demonstrate that, our system can successfully recognize hand
gestures with 98.33% recognition rate.
Abstract: In this paper, a novel algorithm based on Ridgelet
Transform and support vector machine is proposed for human action
recognition. The Ridgelet transform is a directional multi-resolution
transform and it is more suitable for describing the human action by
performing its directional information to form spatial features
vectors. The dynamic transition between the spatial features is carried
out using both the Principal Component Analysis and clustering
algorithm K-means. First, the Principal Component Analysis is used
to reduce the dimensionality of the obtained vectors. Then, the kmeans
algorithm is then used to perform the obtained vectors to form
the spatio-temporal pattern, called set-of-labels, according to given
periodicity of human action. Finally, a Support Machine classifier is
used to discriminate between the different human actions. Different
tests are conducted on popular Datasets, such as Weizmann and
KTH. The obtained results show that the proposed method provides
more significant accuracy rate and it drives more robustness in very
challenging situations such as lighting changes, scaling and dynamic
environment
Abstract: This paper discusses the Urdu script characteristics,
Urdu Nastaleeq and a simple but a novel and robust technique to
recognize the printed Urdu script without a lexicon. Urdu being a
family of Arabic script is cursive and complex script in its nature, the
main complexity of Urdu compound/connected text is not its
connections but the forms/shapes the characters change when it is
placed at initial, middle or at the end of a word. The characters
recognition technique presented here is using the inherited
complexity of Urdu script to solve the problem. A word is scanned
and analyzed for the level of its complexity, the point where the level
of complexity changes is marked for a character, segmented and
feeded to Neural Networks. A prototype of the system has been
tested on Urdu text and currently achieves 93.4% accuracy on the
average.
Abstract: A word recognition architecture based on a network
of neural associative memories and hidden Markov models has been
developed. The input stream, composed of subword-units like wordinternal
triphones consisting of diphones and triphones, is provided
to the network of neural associative memories by hidden Markov
models. The word recognition network derives words from this input
stream. The architecture has the ability to handle ambiguities on
subword-unit level and is also able to add new words to the
vocabulary during performance. The architecture is implemented to
perform the word recognition task in a language processing system
for understanding simple command sentences like “bot show apple".
Abstract: We developed a vision interface immersive projection system, CAVE in virtual rea using hand gesture recognition with computer vis background image was subtracted from current webcam and we convert the color space of the imag Then we mask skin regions using skin color range t a noise reduction operation. We made blobs fro gestures were recognized using these blobs. Using recognition, we could implement an effective bothering devices for CAVE. e framework for an reality research field vision techniques. ent image frame age into HSV space. e threshold and apply from the image and ing our hand gesture e interface without
Abstract: This paper proposes a location-aware system for
household robots which allows users to paste predefined paper tags at
different locations according to users- comprehension of the house. In this system a household robot may be aware of its location and the
attributes thereof by visually recognizing the tags when the robot is moving. This paper also presents a novel user interface to define a
moving path of the robot, which allows users to draw the path in the air
with a finger so as to generate commands for following motions.
Abstract: This paper deals with automatic sentence modality
recognition in French. In this work, only prosodic features are
considered. The sentences are recognized according to the three
following modalities: declarative, interrogative and exclamatory
sentences. This information will be used to animate a talking head for
deaf and hearing-impaired children. We first statistically study a real
radio corpus in order to assess the feasibility of the automatic
modeling of sentence types. Then, we test two sets of prosodic
features as well as two different classifiers and their combination. We
further focus our attention on questions recognition, as this modality
is certainly the most important one for the target application.
Abstract: In this paper we illuminate a frequency domain based
classification method for video scenes. Videos from certain topical
areas often contain activities with repeating movements. Sports
videos, home improvement videos, or videos showing mechanical
motion are some example areas. Assessing main and side frequencies
of each repeating movement gives rise to the motion type. We
obtain the frequency domain by transforming spatio-temporal motion
trajectories. Further on we explain how to compute frequency features
for video clips and how to use them for classifying. The focus of
the experimental phase is on transforms utilized for our system.
By comparing various transforms, experiments show the optimal
transform for a motion frequency based approach.
Abstract: In this paper we examine the use of global texture analysis based approaches for the purpose of Persian font recognition in machine-printed document images. Most existing methods for font recognition make use of local typographical features and connected component analysis. However derivation of such features is not an easy task. Gabor filters are appropriate tools for texture analysis and are motivated by human visual system. Here we consider document images as textures and use Gabor filter responses for identifying the fonts. The method is content independent and involves no local feature analysis. Two different classifiers Weighted Euclidean Distance and SVM are used for the purpose of classification. Experiments on seven different type faces and four font styles show average accuracy of 85% with WED and 82% with SVM classifier over typefaces
Abstract: Vision based tracking problem is solved through a
combination of optical flow, MACH filter and log r-θ mapping.
Optical flow is used for detecting regions of movement in video
frames acquired under variable lighting conditions. The region of
movement is segmented and then searched for the target. A template
is used for target recognition on the segmented regions for detecting
the region of interest. The template is trained offline on a sequence of
target images that are created using the MACH filter and log r-θ
mapping. The template is applied on areas of movement in
successive frames and strong correlation is seen for in-class targets.
Correlation peaks above a certain threshold indicate the presence of
target and the target is tracked over successive frames.
Abstract: This article presents the results using a parametric approach and a Wavelet Transform in analysing signals emitting from the sperm whale. The extraction of intrinsic characteristics of these unique signals emitted by marine mammals is still at present a difficult exercise for various reasons: firstly, it concerns non-stationary signals, and secondly, these signals are obstructed by interfering background noise. In this article, we compare the advantages and disadvantages of both methods: Auto Regressive models and Wavelet Transform. These approaches serve as an alternative to the commonly used estimators which are based on the Fourier Transform for which the hypotheses necessary for its application are in certain cases, not sufficiently proven. These modern approaches provide effective results particularly for the periodic tracking of the signal's characteristics and notably when the signal-to-noise ratio negatively effects signal tracking. Our objectives are twofold. Our first goal is to identify the animal through its acoustic signature. This includes recognition of the marine mammal species and ultimately of the individual animal (within the species). The second is much more ambitious and directly involves the intervention of cetologists to study the sounds emitted by marine mammals in an effort to characterize their behaviour. We are working on an approach based on the recordings of marine mammal signals and the findings from this data result from the Wavelet Transform. This article will explore the reasons for using this approach. In addition, thanks to the use of new processors, these algorithms once heavy in calculation time can be integrated in a real-time system.
Abstract: Traditional object segmentation methods are time consuming and computationally difficult. In this paper, onedimensional object detection along the secant lines is applied. Statistical features of texture images are computed for the recognition process. Example matrices of these features and formulae for calculation of similarities between two feature patterns are expressed. And experiments are also carried out using these features.
Abstract: Gesture recognition is a challenging task for extracting
meaningful gesture from continuous hand motion. In this paper, we propose an automatic system that recognizes isolated gesture,
in addition meaningful gesture from continuous hand motion for Arabic numbers from 0 to 9 in real-time based on Hidden Markov Models (HMM). In order to handle isolated gesture, HMM using
Ergodic, Left-Right (LR) and Left-Right Banded (LRB) topologies is applied over the discrete vector feature that is extracted from stereo
color image sequences. These topologies are considered to different
number of states ranging from 3 to 10. A new system is developed to recognize the meaningful gesture based on zero-codeword detection
with static velocity motion for continuous gesture. Therefore, the
LRB topology in conjunction with Baum-Welch (BW) algorithm for
training and forward algorithm with Viterbi path for testing presents the best performance. Experimental results show that the proposed system can successfully recognize isolated and meaningful gesture and achieve average rate recognition 98.6% and 94.29% respectively.
Abstract: Artificial Intelligence (AI) methods are increasingly being used for problem solving. This paper concerns using AI-type learning machines for power quality problem, which is a problem of general interest to power system to provide quality power to all appliances. Electrical power of good quality is essential for proper operation of electronic equipments such as computers and PLCs. Malfunction of such equipment may lead to loss of production or disruption of critical services resulting in huge financial and other losses. It is therefore necessary that critical loads be supplied with electricity of acceptable quality. Recognition of the presence of any disturbance and classifying any existing disturbance into a particular type is the first step in combating the problem. In this work two classes of AI methods for Power quality data mining are studied: Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs). We show that SVMs are superior to ANNs in two critical respects: SVMs train and run an order of magnitude faster; and SVMs give higher classification accuracy.
Abstract: Observations and long-term trends indicate that climate
change impacts would be significant and affects Taiwan directly and
severely. Taiwan engages not only in mitigation, but also in adaptation.
However, there are cognitive gaps on adaptation between government
and populace. Besides, a vision of zero-carbon and renewable energy
100% will be adopted in future. Therefore, the objectives of this
article are to 1) hold a National Forum for knowing differences
between the strategies of zero-carbon and renewable energy 100% and
cognitions of general populace, and 2) plan a clear roadmap for the
vision, strategy, and measures. In this forum, we set 5 group topics, 5
presumed themes, and issues mentioned review for concluding the
critical issues. Finally, there are 4 strategies and 14 critical issues
which correlate with the vision and strategy of government and the
cognition of the general populace.