Abstract: The electronically available Urdu data is in image form
which is very difficult to process. Printed Urdu data is the root cause
of problem. So for the rapid progress of Urdu language we need an
OCR systems, which can help us to make Urdu data available for the
common person. Research has been carried out for years to automata
Arabic and Urdu script. But the biggest hurdle in the development of
Urdu OCR is the challenge to recognize Nastalique Script which is
taken as standard for writing Urdu language. Nastalique script is
written diagonally with no fixed baseline which makes the script
somewhat complex. Overlap is present not only in characters but in
the ligatures as well. This paper proposes a method which allows
successful recognition of Nastalique Script.
Abstract: We present a hybrid architecture of recurrent neural
networks (RNNs) inspired by hidden Markov models (HMMs). We
train the hybrid architecture using genetic algorithms to learn and
represent dynamical systems. We train the hybrid architecture on a
set of deterministic finite-state automata strings and observe the
generalization performance of the hybrid architecture when presented
with a new set of strings which were not present in the training data
set. In this way, we show that the hybrid system of HMM and RNN
can learn and represent deterministic finite-state automata. We ran
experiments with different sets of population sizes in the genetic
algorithm; we also ran experiments to find out which weight
initializations were best for training the hybrid architecture. The
results show that the hybrid architecture of recurrent neural networks
inspired by hidden Markov models can train and represent dynamical
systems. The best training and generalization performance is
achieved when the hybrid architecture is initialized with random real
weight values of range -15 to 15.
Abstract: The prediction of transmembrane helical segments
(TMHs) in membrane proteins is an important field in the
bioinformatics research. In this paper, a method based on discrete
wavelet transform (DWT) has been developed to predict the number
and location of TMHs in membrane proteins. PDB coded as 1F88 was
chosen as an example to describe the prediction of the number and
location of TMHs in membrane proteins by using this method. One
group of test data sets that contain total 19 protein sequences was
utilized to access the effect of this method. Compared with the
prediction results of DAS, PRED-TMR2, SOSUI, HMMTOP2.0 and
TMHMM2.0, the obtained results indicate that the presented method
has higher prediction accuracy.
Abstract: Discrimination between different classes of environmental
sounds is the goal of our work. The use of a sound recognition
system can offer concrete potentialities for surveillance and
security applications. The first paper contribution to this research
field is represented by a thorough investigation of the applicability
of state-of-the-art audio features in the domain of environmental
sound recognition. Additionally, a set of novel features obtained by
combining the basic parameters is introduced. The quality of the
features investigated is evaluated by a HMM-based classifier to which
a great interest was done. In fact, we propose to use a Multi-Style
training system based on HMMs: one recognizer is trained on a
database including different levels of background noises and is used
as a universal recognizer for every environment. In order to enhance
the system robustness by reducing the environmental variability, we
explore different adaptation algorithms including Maximum Likelihood
Linear Regression (MLLR), Maximum A Posteriori (MAP)
and the MAP/MLLR algorithm that combines MAP and MLLR.
Experimental evaluation shows that a rather good recognition rate
can be reached, even under important noise degradation conditions
when the system is fed by the convenient set of features.
Abstract: Performance of any continuous speech recognition system is highly dependent on performance of the acoustic models. Generally, development of the robust spoken language technology relies on the availability of large amounts of data. Common way to cope with little data for training each state of Markov models is treebased state tying. This tying method applies contextual questions to tie states. Manual procedure for question generation suffers from human errors and is time consuming. Various automatically generated questions are used to construct decision tree. There are three approaches to generate questions to construct HMMs based on decision tree. One approach is based on misrecognized phonemes, another approach basically uses feature table and the other is based on state distributions corresponding to context-independent subword units. In this paper, all these methods of automatic question generation are applied to the decision tree on FARSDAT corpus in Persian language and their results are compared with those of manually generated questions. The results show that automatically generated questions yield much better results and can replace manually generated questions in Persian language.
Abstract: New generalization of the new class matrix polynomial set have been obtained. An explicit representation and an expansion of the matrix exponential in a series of these matrix are given for these matrix polynomials.
Abstract: Hidden Markov Model (HMM) is a stochastic method
which has been used in various signal processing and character
recognition. This study proposes to use HMM to recognize Javanese
characters from a number of different handwritings, whereby HMM
is used to optimize the number of state and feature extraction. An
85.7 % accuracy is obtained as the best result in 16-stated vertical
model using pure HMM. This initial result is satisfactory for
prompting further research.
Abstract: Despite the fact that Arabic language is currently one
of the most common languages worldwide, there has been only a
little research on Arabic speech recognition relative to other
languages such as English and Japanese. Generally, digital speech
processing and voice recognition algorithms are of special
importance for designing efficient, accurate, as well as fast automatic
speech recognition systems. However, the speech recognition process
carried out in this paper is divided into three stages as follows: firstly,
the signal is preprocessed to reduce noise effects. After that, the
signal is digitized and hearingized. Consequently, the voice activity
regions are segmented using voice activity detection (VAD)
algorithm. Secondly, features are extracted from the speech signal
using Mel-frequency cepstral coefficients (MFCC) algorithm.
Moreover, delta and acceleration (delta-delta) coefficients have been
added for the reason of improving the recognition accuracy. Finally,
each test word-s features are compared to the training database using
dynamic time warping (DTW) algorithm. Utilizing the best set up
made for all affected parameters to the aforementioned techniques,
the proposed system achieved a recognition rate of about 98.5%
which outperformed other HMM and ANN-based approaches
available in the literature.
Abstract: An automatic speech recognition system for the
formal Arabic language is needed. The Quran is the most formal
spoken book in Arabic, it is spoken all over the world. In this
research, an automatic speech recognizer for Quranic based speakerindependent
was developed and tested. The system was developed
based on the tri-phone Hidden Markov Model and Maximum
Likelihood Linear Regression (MLLR). The MLLR computes a set
of transformations which reduces the mismatch between an initial
model set and the adaptation data. It uses the regression class tree, as
well as, estimates a set of linear transformations for the mean and
variance parameters of a Gaussian mixture HMM system. The 30th
Chapter of the Quran, with five of the most famous readers of the
Quran, was used for the training and testing of the data. The chapter
includes about 2000 distinct words. The advantages of using the
Quranic verses as the database in this developed recognizer are the
uniqueness of the words and the high level of orderliness between
verses. The level of accuracy from the tested data ranged 68 to 85%.
Abstract: In developing a text-to-speech system, it is well
known that the accuracy of information extracted from a text is
crucial to produce high quality synthesized speech. In this paper, a
new scheme for converting text into its equivalent phonetic spelling
is introduced and developed. This method is applicable to many
applications in text to speech converting systems and has many
advantages over other methods. The proposed method can also
complement the other methods with a purpose of improving their
performance. The proposed method is a probabilistic model and is
based on Smooth Ergodic Hidden Markov Model. This model can be
considered as an extension to HMM. The proposed method is applied
to Persian language and its accuracy in converting text to speech
phonetics is evaluated using simulations.
Abstract: Electromyography (EMG) is the study of muscles function through analysis of electrical activity produced from muscles. This electrical activity which is displayed in the form of signal is the result of neuromuscular activation associated with muscle contraction. The most common techniques of EMG signal recording are by using surface and needle/wire electrode where the latter is usually used for interest in deep muscle. This paper will focus on surface electromyogram (SEMG) signal. During SEMG recording, several problems had to been countered such as noise, motion artifact and signal instability. Thus, various signal processing techniques had been implemented to produce a reliable signal for analysis. SEMG signal finds broad application particularly in biomedical field. It had been analyzed and studied for various interests such as neuromuscular disease, enhancement of muscular function and human-computer interface.
Abstract: The Automatic Speech Recognition (ASR) applied to
Arabic language is a challenging task. This is mainly related to the
language specificities which make the researchers facing multiple
difficulties such as the insufficient linguistic resources and the very
limited number of available transcribed Arabic speech corpora. In
this paper, we are interested in the development of a HMM-based
ASR system for Standard Arabic (SA) language. Our fundamental
research goal is to select the most appropriate acoustic parameters
describing each audio frame, acoustic models and speech recognition
unit. To achieve this purpose, we analyze the effect of varying frame
windowing (size and period), acoustic parameter number resulting
from features extraction methods traditionally used in ASR, speech
recognition unit, Gaussian number per HMM state and number of
embedded re-estimations of the Baum-Welch Algorithm. To evaluate
the proposed ASR system, a multi-speaker SA connected-digits
corpus is collected, transcribed and used throughout all experiments.
A further evaluation is conducted on a speaker-independent continue
SA speech corpus. The phonemes recognition rate is 94.02% which is
relatively high when comparing it with another ASR system
evaluated on the same corpus.
Abstract: A fusion classifier composed of two modules, one made by a hidden Markov model (HMM) and the other by a support vector machine (SVM), is proposed to recognize faces with pose variations in open-set recognition settings. The HMM module captures the evolution of facial features across a subject-s face using the subject-s facial images only, without referencing to the faces of others. Because of the captured evolutionary process of facial features, the HMM module retains certain robustness against pose variations, yielding low false rejection rates (FRR) for recognizing faces across poses. This is, however, on the price of poor false acceptance rates (FAR) when recognizing other faces because it is built upon withinclass samples only. The SVM module in the proposed model is developed following a special design able to substantially diminish the FAR and further lower down the FRR. The proposed fusion classifier has been evaluated in performance using the CMU PIE database, and proven effective for open-set face recognition with pose variations. Experiments have also shown that it outperforms the face classifier made by HMM or SVM alone.
Abstract: Two algorithms are proposed to reduce the storage requirements for mammogram images. The input image goes through a shrinking process that converts the 16-bit images to 8-bits by using pixel-depth conversion algorithm followed by enhancement process. The performance of the algorithms is evaluated objectively and subjectively. A 50% reduction in size is obtained with no loss of significant data at the breast region.
Abstract: The most widely used semiconductor memory types
are the Dynamic Random Access Memory (DRAM) and Static
Random Access memory (SRAM). Competition among memory
manufacturers drives the need to decrease power consumption and
reduce the probability of read failure. A technology that is relatively
new and has not been explored is the FinFET technology. In this
paper, a single cell Schmitt Trigger Based Static RAM using FinFET
technology is proposed and analyzed. The accuracy of the result is
validated by means of HSPICE simulations with 32nm FinFET
technology and the results are then compared with 6T SRAM using
the same technology.
Abstract: This paper presents a formant-tracking linear prediction
(FTLP) model for speech processing in noise. The main focus of this
work is the detection of formant trajectory based on Hidden Markov
Models (HMM), for improved formant estimation in noise. The
approach proposed in this paper provides a systematic framework for
modelling and utilization of a time- sequence of peaks which satisfies
continuity constraints on parameter; the within peaks are modelled
by the LP parameters. The formant tracking LP model estimation
is composed of three stages: (1) a pre-cleaning multi-band spectral
subtraction stage to reduce the effect of residue noise on formants
(2) estimation stage where an initial estimate of the LP model of
speech for each frame is obtained (3) a formant classification using
probability models of formants and Viterbi-decoders. The evaluation
results for the estimation of the formant tracking LP model tested
in Gaussian white noise background, demonstrate that the proposed
combination of the initial noise reduction stage with formant tracking
and LPC variable order analysis, results in a significant reduction in
errors and distortions. The performance was evaluated with noisy
natual vowels extracted from international french and English vocabulary
speech signals at SNR value of 10dB. In each case, the
estimated formants are compared to reference formants.
Abstract: In this study, an OCR system for segmentation,
feature extraction and recognition of Ottoman Scripts has been
developed using handwritten characters. Detection of handwritten
characters written by humans is a difficult process. Segmentation and
feature extraction stages are based on geometrical feature analysis,
followed by the chain code transformation of the main strokes of
each character. The output of segmentation is well-defined segments
that can be fed into any classification approach. The classes of main
strokes are identified through left-right Hidden Markov Model
(HMM).
Abstract: Automatic Extraction of Event information from
social text stream (emails, social network sites, blogs etc) is a vital
requirement for many applications like Event Planning and
Management systems and security applications. The key information
components needed from Event related text are Event title, location,
participants, date and time. Emails have very unique distinctions over
other social text streams from the perspective of layout and format
and conversation style and are the most commonly used
communication channel for broadcasting and planning events.
Therefore we have chosen emails as our dataset. In our work, we
have employed two statistical NLP methods, named as Finite State
Machines (FSM) and Hidden Markov Model (HMM) for the
extraction of event related contextual information. An application
has been developed providing a comparison among the two methods
over the event extraction task. It comprises of two modules, one for
each method, and works for both bulk as well as direct user input.
The results are evaluated using Precision, Recall and F-Score.
Experiments show that both methods produce high performance and
accuracy, however HMM was good enough over Title extraction and
FSM proved to be better for Venue, Date, and time.
Abstract: The increasingly sophisticated technologies have now been able to provide assistance for surgeons to improve surgical
performance through various training programs. Equally important to learning skills is the assessment method as it determines the learning and technical proficiency of a trainee. A consistent and
rigorous assessment system will ensure that trainees acquire the specific level of competency prior to certification. This paper
reviews the methods currently in use for assessment of surgical
skill and some modern techniques using computer-based
measurements and virtual reality systems for more quantitative
measurements
Abstract: Named Entity Recognition (NER) aims to classify each word of a document into predefined target named entity classes and is now-a-days considered to be fundamental for many Natural Language Processing (NLP) tasks such as information retrieval, machine translation, information extraction, question answering systems and others. This paper reports about the development of a NER system for Bengali and Hindi using Support Vector Machine (SVM). Though this state of the art machine learning technique has been widely applied to NER in several well-studied languages, the use of this technique to Indian languages (ILs) is very new. The system makes use of the different contextual information of the words along with the variety of features that are helpful in predicting the four different named (NE) classes, such as Person name, Location name, Organization name and Miscellaneous name. We have used the annotated corpora of 122,467 tokens of Bengali and 502,974 tokens of Hindi tagged with the twelve different NE classes 1, defined as part of the IJCNLP-08 NER Shared Task for South and South East Asian Languages (SSEAL) 2. In addition, we have manually annotated 150K wordforms of the Bengali news corpus, developed from the web-archive of a leading Bengali newspaper. We have also developed an unsupervised algorithm in order to generate the lexical context patterns from a part of the unlabeled Bengali news corpus. Lexical patterns have been used as the features of SVM in order to improve the system performance. The NER system has been tested with the gold standard test sets of 35K, and 60K tokens for Bengali, and Hindi, respectively. Evaluation results have demonstrated the recall, precision, and f-score values of 88.61%, 80.12%, and 84.15%, respectively, for Bengali and 80.23%, 74.34%, and 77.17%, respectively, for Hindi. Results show the improvement in the f-score by 5.13% with the use of context patterns. Statistical analysis, ANOVA is also performed to compare the performance of the proposed NER system with that of the existing HMM based system for both the languages.