Abstract: Efficient preprocessing is essential for automatic
recognition of handwritten documents. In this paper, techniques for
segmenting words in handwritten Arabic text are presented. Firstly,
connected components (CCs) are extracted, and distances among
different components are analyzed. The statistical distribution of these
distances is then obtained to determine an optimal threshold for word
segmentation. Meanwhile, an improved projection-based method is
also employed for baseline detection. The proposed method has been
successfully tested on the IFN/ENIT database consisting of 26459
Arabic words handwritten by 411 different writers, and the results
are promising, with more accurate detection of
the baseline and segmentation of words for further recognition.
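The gap-threshold idea in this abstract can be illustrated with a minimal sketch. It assumes each connected component is already reduced to its horizontal extent `(x_min, x_max)` on one text line. The abstract does not state how the threshold is derived from the distance distribution, so `pick_threshold` below is a hypothetical stand-in that splits the sorted gaps at their largest jump.

```python
def horizontal_gaps(boxes):
    """Gaps between consecutive components; boxes are (x_min, x_max), sorted by x_min."""
    gaps = []
    right_edge = boxes[0][1]
    for x_min, x_max in boxes[1:]:
        gaps.append(max(0, x_min - right_edge))  # overlapping components give gap 0
        right_edge = max(right_edge, x_max)
    return gaps


def pick_threshold(gaps):
    """Hypothetical rule (not from the paper): midpoint of the largest jump in the sorted gaps."""
    ordered = sorted(set(gaps))
    if len(ordered) < 2:
        return float("inf")  # no evidence of two gap populations, so never split
    jumps = [(b - a, (a + b) / 2) for a, b in zip(ordered, ordered[1:])]
    return max(jumps)[1]


def segment_words(boxes):
    """Group components into words: a gap above the threshold starts a new word."""
    boxes = sorted(boxes)
    if not boxes:
        return []
    gaps = horizontal_gaps(boxes)
    threshold = pick_threshold(gaps)
    words, current = [], [boxes[0]]
    for box, gap in zip(boxes[1:], gaps):
        if gap > threshold:
            words.append(current)
            current = []
        current.append(box)
    words.append(current)
    return words
```

For example, `segment_words([(0, 10), (12, 20), (22, 30), (45, 55), (57, 66)])` groups the first three components into one word and the last two into another, because the single wide gap stands apart from the narrow ones.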
Abstract: Debate over the use of particular methods in interlanguage pragmatics has increased recently. Researchers have argued the advantages and disadvantages of each method, whether natural or elicited. Findings of different studies indicate that the use of one method may not provide enough data to answer all of a study's questions. The current study investigated the validity of using a multimethod approach in interlanguage pragmatics to understand the development of requests in Arabic as a second language (Arabic L2). To this end, the study adopted two methods belonging to two types of data sources: institutional discourse (natural data) and role play (elicited data). Participants were 117 learners of Arabic L2 at the university level, representing four levels (beginner, low-intermediate, high-intermediate, and advanced). Results showed that using two or more methods in interlanguage pragmatics affects the size and nature of the data.
Abstract: The cursive nature of Arabic writing makes it
difficult to accurately segment characters or even deal with the whole
word efficiently. Therefore, in this paper, a printed Arabic sub-word
recognition system is proposed. The suggested algorithm utilizes
geometrical moments as descriptors for the separated sub-words.
Three types of moments are investigated and applied to the printed
sub-word images after dividing each image into multiple parts using
windowing. Since moments are global descriptors, the windowing
mechanism allows the moments to be applied to local regions of the
sub-word. The local-global mixture of the proposed scheme increases
the discrimination power of the moments while keeping the
simplicity and ease of use of moments.
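The windowing idea can be sketched as follows. The abstract does not name the three moment types, so this sketch assumes plain raw geometric moments m_pq = sum of x^p * y^q * I(x, y) as a stand-in. The 2x2 window grid and the chosen orders are also illustrative assumptions.

```python
def raw_moment(img, p, q):
    """Raw geometric moment m_pq of a binary image given as a list of rows."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


def split_windows(img, rows, cols):
    """Cut the image into a rows x cols grid of sub-images (row-major order)."""
    h, w = len(img), len(img[0])
    windows = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            windows.append([row[x0:x1] for row in img[y0:y1]])
    return windows


def windowed_features(img, rows=2, cols=2, orders=((0, 0), (1, 0), (0, 1))):
    """Concatenate the moments of every window into one feature vector."""
    return [raw_moment(win, p, q)
            for win in split_windows(img, rows, cols)
            for p, q in orders]
```

Each window uses its own local coordinates, so the same global descriptor yields region-specific values. This is the local-global mixture the abstract refers to.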
Abstract: Automatic Speech Recognition (ASR) applied to the
Arabic language is a challenging task. This is mainly related to the
language specificities, which confront researchers with multiple
difficulties such as insufficient linguistic resources and the very
limited number of available transcribed Arabic speech corpora. In
this paper, we are interested in the development of an HMM-based
ASR system for the Standard Arabic (SA) language. Our fundamental
research goal is to select the most appropriate acoustic parameters
describing each audio frame, acoustic models and speech recognition
unit. To achieve this purpose, we analyze the effect of varying the frame
windowing (size and period), the number of acoustic parameters resulting
from feature extraction methods traditionally used in ASR, the speech
recognition unit, the number of Gaussians per HMM state, and the number of
embedded re-estimations of the Baum-Welch algorithm. To evaluate
the proposed ASR system, a multi-speaker SA connected-digits
corpus is collected, transcribed and used throughout all experiments.
A further evaluation is conducted on a speaker-independent continuous
SA speech corpus. The phoneme recognition rate is 94.02%, which is
relatively high compared with another ASR system
evaluated on the same corpus.
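As an illustration of the frame windowing parameters (size and period) that this abstract varies, here is a minimal sketch. The 25 ms size and 10 ms period defaults are common choices assumed for the example, not values reported by the paper, and the Hamming window is likewise an assumption.

```python
import math


def frame_signal(samples, sample_rate, frame_ms=25.0, period_ms=10.0):
    """Split a signal into overlapping frames of frame_ms, one every period_ms."""
    size = round(sample_rate * frame_ms / 1000)
    step = round(sample_rate * period_ms / 1000)
    return [samples[start:start + size]
            for start in range(0, len(samples) - size + 1, step)]


def hamming(n):
    """Hamming window of length n (assumed here as the tapering function)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def windowed_frames(samples, sample_rate, frame_ms=25.0, period_ms=10.0):
    """Frames multiplied by the window, ready for feature extraction."""
    frames = frame_signal(samples, sample_rate, frame_ms, period_ms)
    if not frames:
        return []
    window = hamming(len(frames[0]))
    return [[x * w for x, w in zip(frame, window)] for frame in frames]
```

Changing `frame_ms` and `period_ms` changes both the number of frames and how much neighbouring frames overlap, which is the trade-off such an analysis explores.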
Abstract: This paper presents an evaluation of sound parameterization methods for recognizing some spoken Arabic words, namely the digits from zero to nine. Each isolated spoken word is represented by a single template based on a specific recognition feature, and recognition is based on the Euclidean distance from those templates. The performance analysis of recognition is based on four parameterization features: the Burg Spectrum Analysis, the Walsh Spectrum Analysis, the Thomson Multitaper Spectrum Analysis and the Mel Frequency Cepstral Coefficients (MFCC) features. The main aim of this paper is to compare, analyze, and discuss the outcomes of spoken Arabic digit recognition systems based on the selected recognition features. The results acquired confirm that the use of MFCC features is a very promising method for recognizing spoken Arabic digits.
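Template matching by Euclidean distance can be sketched as below. The sketch assumes every utterance has already been reduced to a fixed-length feature vector, and that a word's template is the mean of its training vectors. The abstract does not say how templates are built, so the averaging step is an assumption.

```python
import math


def mean_template(examples):
    """Average several equal-length feature vectors into one template (assumed construction)."""
    n = len(examples)
    return [sum(column) / n for column in zip(*examples)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features, templates):
    """Return the label whose template is closest to the feature vector."""
    return min(templates, key=lambda label: euclidean(features, templates[label]))
```

With `templates = {"zero": mean_template(zero_vectors), "one": mean_template(one_vectors)}`, where both lists are hypothetical training data, `classify(vector, templates)` returns the nearest word label. The four feature types compared in the paper would differ only in how the vectors are computed.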
Abstract: This paper presents a new steganography approach suitable for Arabic texts. It can be classified under feature-coding steganography methods. The approach hides secret information bits within letters by exploiting their inherent points. To mark the specific letters holding secret bits, the scheme considers two features: the existence of points in the letters and the redundant Arabic extension character. We use pointed letters with an extension to hold the secret bit 'one' and un-pointed letters with an extension to hold 'zero'. This steganography technique is also attractive for other languages with scripts similar to Arabic, such as Persian and Urdu.
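A minimal sketch of this encoding follows, under several assumptions that are not in the abstract. The extension character is taken to be U+0640 (tatweel, or kashida). Only letters that join to a following letter are used as carriers. The cover text is assumed to be undiacritized and free of existing extension characters. Ligatures such as lam-alef are ignored. The letter sets below are illustrative.

```python
KASHIDA = "\u0640"  # Arabic tatweel, assumed to be the "extension character"

# Letters that connect to the next letter, split by whether they carry points (dots).
POINTED = set("بتثجخشضظغفقني")
UNPOINTED = set("حسصطعكلمه")
ARABIC_LETTERS = POINTED | UNPOINTED | set("اأإآدذرزوةىءؤئ")


def embed(cover, bits):
    """Hide a bit string: an extension after a pointed letter means 1, after an un-pointed letter 0."""
    out, pending = [], list(bits)
    for i, ch in enumerate(cover):
        out.append(ch)
        if not pending:
            continue
        wanted = POINTED if pending[0] == "1" else UNPOINTED
        next_is_letter = i + 1 < len(cover) and cover[i + 1] in ARABIC_LETTERS
        if ch in wanted and next_is_letter:  # only extend inside a word
            out.append(KASHIDA)
            pending.pop(0)
    if pending:
        raise ValueError("cover text too short for the secret bits")
    return "".join(out)


def extract(stego):
    """Read the bits back from the letter preceding each extension character."""
    return "".join("1" if stego[i - 1] in POINTED else "0"
                   for i, ch in enumerate(stego)
                   if ch == KASHIDA and i > 0)
```

Removing every U+0640 from the output restores the cover text exactly, which is why the extension character is described as redundant. Capacity depends on how many suitable letters the cover contains in the order the bits require.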
Abstract: A comparison between the performance of Latin and
Arabic handwritten digit recognition problems is presented. The
performance of ten different classifiers is tested on two similar
Arabic and Latin handwritten digit databases. The analysis shows
that the Arabic handwritten digit recognition problem is easier than that
of Latin digits. This is because the interclass difference in the case of
Latin digits is smaller than in Arabic digits and variances in writing
Latin digits are larger. Consequently, weaker yet fast classifiers are
expected to play a more prominent role in Arabic handwritten digit
recognition.
Abstract: The AL-MAJIRI school system is a variant of private
Arabic and Islamic schools which cater for the religious and moral development of Muslims. In the past, the system produced clerics,
scholars, judges, religious reformers, eminent teachers and great men who are worthy of emulation, particularly in northern Nigeria.
Gradually, the system lost its glory but continued to discharge its
educational responsibilities to a certain extent. This paper takes a
look at the activities of the AL-MAJIRI schools. The introduction
provides background information about Nigeria where the schools
operate. This is followed by an overview of the Nigerian educational system, the nature and the features of the AL-MAJIRI school system,
its weaknesses and the current challenges facing the schools. The paper concludes with an emphasis on the urgent need for a comprehensive reform of the curriculum content of the schools. The step-by-step procedure required for the reform is discussed.
Abstract: Australian government agencies have a natural desire
to provide migrants with a wide range of opportunities. Consequently,
government online services should be equally available to migrants
with a non-English speaking background (NESB). Despite the
commendable efforts of governments and local agencies in Australia
to provide such services, in reality, many NESB communities are not
taking advantage of these services. This article, based on an
extensive case study regarding the use of online government services
by the Arabic NESB community in Australia, reports on the
possible reasons for this issue, as well as suggestions for
improvement. The conclusion is that Australia should implement
ICT-based or e-government policies, programmes, and services that
more accurately reflect migrant cultures and languages so that
migrant integration can be more fully accomplished. Specifically, this
article presents an NESB Model that adopts the value of user-centricity,
or a more individual-focused approach to government
online services in Australia.
Abstract: This paper presents a new approach to the problem of recognizing machine-printed Arabic texts. Because of the difficulty of recognizing cursive Arabic words, the text has to be normalized and segmented before the recognition stage. The new scheme for recognizing Arabic characters relies on a classifier built from multiple parallel neural networks. The classifier has two phases. The first phase categorizes the input character into one of eight groups. The second phase classifies the character into one of the Arabic character classes in that group. The system achieved a high recognition rate.
Abstract: The development of signal compression
algorithms is making considerable progress. These algorithms are
continuously improved by new tools and aim to reduce, on average,
the number of bits necessary for the signal representation while
minimizing the reconstruction error. This article proposes
the compression of Arabic speech signals by a hybrid method
combining the wavelet transform and linear prediction. The
adopted approach rests, on one hand, on the decomposition of the
original signal by means of analysis filters, followed by the
compression stage, and on the other hand, on the application of
linear prediction of order 5 to the compressed signal coefficients. The aim of
this approach is the estimation of the prediction error, which is
coded and transmitted. The decoding operation is then used to
reconstitute the original signal. Thus, an adequate choice of the
filter bank for the transform is necessary to increase the
compression rate while keeping the distortion imperceptible from an
auditory point of view.
Abstract: The paper presents a complete discrete statistical framework based on a novel vector quantization (VQ) front-end process. This new VQ approach performs an optimal distribution of VQ codebook components over HMM states. This technique, which we name the distributed vector quantization (DVQ) of hidden Markov models, succeeds in unifying acoustic micro-structure and phonetic macro-structure when the HMM parameters are estimated. The DVQ technique is implemented through two variants. The first variant uses the K-means algorithm (K-means-DVQ) to optimize the VQ, while the second variant exploits the classification behavior of neural networks (NN-DVQ) for the same purpose. The proposed variants are compared with the HMM-based baseline system in recognition experiments on specific Arabic consonants. The results show that the distributed vector quantization technique increases the performance of the discrete HMM system.
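For orientation, the sketch below shows conventional K-means vector quantization, the front end that a discrete HMM usually relies on. It is not the DVQ technique itself, whose distribution of codebook entries over HMM states is not specified in the abstract. The deterministic initialization from the first `k` vectors is an assumption made to keep the example reproducible.

```python
def nearest(vec, codebook):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])))


def kmeans_codebook(vectors, k, iterations=20):
    """Plain K-means: alternate between assigning vectors and re-averaging codewords."""
    codebook = [list(v) for v in vectors[:k]]  # assumed deterministic initialization
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(v, codebook)].append(v)
        for idx, members in enumerate(clusters):
            if members:  # keep the old codeword if a cluster is empty
                codebook[idx] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def quantize(vectors, codebook):
    """Map each feature vector to the index of its codeword (the discrete HMM symbol)."""
    return [nearest(v, codebook) for v in vectors]
```

The symbol sequence returned by `quantize` is what a discrete HMM observes, so the quality of the codebook directly limits what the HMM can model.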
Abstract: This paper presents a recognition system for isolated
words such as robot commands, implemented with Time Delay Neural
Networks (TDNN). The aim is to teleoperate a robot for specific tasks such as turn,
close, etc. in an industrial environment, taking into account the
noise coming from the machine. The choice of TDNN is based on its
generalization in terms of accuracy; moreover, it acts as a filter that
allows the passage of certain desirable frequency characteristics of
speech. The goal is to determine the parameters of this filter so that
the system adapts to the variability of the speech signal and, especially, to
noise; for this, the backpropagation technique was used in the
learning phase. The approach was applied to commands pronounced
in two languages separately: French and Arabic. The results for
two test bases of 300 spoken words each are 87% and 97.6% in a
neutral environment, and 77.67% and 92.67% when white Gaussian
noise was added with an SNR of 35 dB.
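The noise condition can be reproduced in outline: white Gaussian noise is scaled so that the ratio of signal power to noise power matches a target SNR in dB. This sketch shows the standard calculation, not the authors' procedure. The seed parameter is an assumption added for repeatability.

```python
import math
import random


def noise_std_for_snr(signal, snr_db):
    """Standard deviation of zero-mean noise giving the requested SNR (in dB)."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return math.sqrt(noise_power)


def add_white_noise(signal, snr_db, seed=0):
    """Return a noisy copy of the signal; the seed makes the example repeatable."""
    rng = random.Random(seed)
    std = noise_std_for_snr(signal, snr_db)
    return [s + rng.gauss(0.0, std) for s in signal]
```

At 35 dB the noise power is roughly 1/3162 of the signal power, so the degradation reported above comes from fairly mild noise.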
Abstract: This paper discusses the characteristics of the Urdu script,
Urdu Nastaleeq, and a simple but novel and robust technique to
recognize printed Urdu script without a lexicon. Urdu, belonging to the
family of Arabic scripts, is cursive and complex in nature. The
main complexity of Urdu compound/connected text lies not in its
connections but in the forms/shapes that characters take when
placed at the initial, middle, or final position of a word. The character
recognition technique presented here uses the inherent
complexity of Urdu script to solve the problem. A word is scanned
and analyzed for the level of its complexity; the point where the level
of complexity changes is marked for a character, which is segmented and
fed to neural networks. A prototype of the system has been
tested on Urdu text and currently achieves 93.4% accuracy on
average.
Abstract: Extracting meaningful gestures from continuous hand motion is a
challenging task in gesture recognition. In this paper, we propose an automatic system that recognizes both isolated gestures
and meaningful gestures from continuous hand motion for Arabic numbers from 0 to 9 in real time based on Hidden Markov Models (HMM). To handle isolated gestures, HMMs with
Ergodic, Left-Right (LR) and Left-Right Banded (LRB) topologies are applied to the discrete feature vectors extracted from stereo
color image sequences. These topologies are evaluated with
numbers of states ranging from 3 to 10. A new system is developed to recognize meaningful gestures in continuous motion based on zero-codeword detection
with static velocity motion. The
LRB topology, in conjunction with the Baum-Welch (BW) algorithm for
training and the forward algorithm with the Viterbi path for testing, gives the best performance. Experimental results show that the proposed system can successfully recognize isolated and meaningful gestures, achieving average recognition rates of 98.6% and 94.29% respectively.
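The Left-Right Banded topology and the forward algorithm can be sketched in a few lines. In an LRB model each state may only stay where it is or advance to the next state. The self-transition probability `stay` and the toy emission table used in the usage note are assumptions for illustration, since the abstract gives no parameter values.

```python
def lrb_transitions(n_states, stay=0.5):
    """Left-Right Banded transition matrix: self-loop or move to the next state only."""
    trans = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        trans[i][i] = stay
        trans[i][i + 1] = 1.0 - stay
    trans[n_states - 1][n_states - 1] = 1.0  # the last state absorbs
    return trans


def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM via the forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for symbol in obs[1:]:
        alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][symbol]
                 for s in range(n)]
    return sum(alpha)


def classify(obs, models):
    """Pick the model (label -> (start, trans, emit)) with the highest likelihood."""
    return max(models, key=lambda label: forward_likelihood(obs, *models[label]))
```

With two states, `start = [1.0, 0.0]`, `emit = [[0.9, 0.1], [0.2, 0.8]]` and `lrb_transitions(2)`, the sequence `[0, 1]` has likelihood 0.9 * 0.5 * 0.1 + 0.9 * 0.5 * 0.8 = 0.405. For long sequences a practical implementation would rescale `alpha` or work in log space to avoid underflow.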
Abstract: Nowadays, OCR systems have many
applications and are increasingly employed in daily life. Much
research has been done on the identification of Latin,
Japanese, and Chinese characters. However, very little investigation
has been performed on Farsi/Arabic character recognition.
The probable reasons are the difficulty and complexity of identifying
these characters compared to the others and the limited IT activity in
Farsi- and Arabic-speaking countries. In this paper, a technique has
been employed to identify isolated Farsi/Arabic characters. A chain
code based algorithm, along with other significant features such
as the number and location of dots and auxiliary parts and the number of
holes in the isolated character, has been used in this study to
identify Farsi/Arabic characters. Experimental results show the
relatively high accuracy of the developed method when it is tested on
several standard Farsi fonts.
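A chain code describes a contour as a sequence of moves between neighbouring pixels. The sketch below uses the common 8-direction Freeman convention in image coordinates (y grows downward) and assumes the contour has already been traced into an ordered list of points. The abstract does not describe the authors' algorithm, and the dot and hole features it mentions are not covered here.

```python
# Freeman 8-direction codes: 0 = east, then counter-clockwise on screen (y grows downward).
DIRECTIONS = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}


def chain_code(points):
    """Chain code of an ordered contour; consecutive points must be 8-neighbours."""
    return [DIRECTIONS[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


def direction_histogram(codes):
    """Normalized count of each direction, a simple fixed-length shape feature."""
    total = len(codes)
    if total == 0:
        return [0.0] * 8
    return [codes.count(d) / total for d in range(8)]
```

For the closed unit square `[(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]` the chain code is `[0, 6, 4, 2]`. The histogram discards the order of moves but gives a vector of constant length that a classifier can compare directly.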
Abstract: In this paper, an efficient structural approach for
recognizing on-line handwritten digits is proposed. After reading
the digit from the user, the slope is estimated and normalized for
adjacent nodes. Based on sign changes in the slope values,
the primitives are identified and extracted. The names of these
primitives are represented by strings, and then a finite state
machine, which contains the grammars of the digits, is traced to
identify the digit. Finally, any remaining ambiguity is
resolved. Experiments showed that this technique is flexible and
can achieve high recognition accuracy for the shapes of the digits
represented in this work.
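The pipeline of slope signs, primitives, and a finite state machine can be sketched as follows. The primitive alphabet (`+`, `-`, `0`, `V`) and the toy machine `TOY_FSM` are hypothetical. The digit grammars used in the paper are not given in the abstract, so the machine below only accepts a simple "falling then rising" stroke.

```python
def slope_sign(p0, p1):
    """Sign of the slope between two pen samples; 'V' marks a vertical step."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    if dx == 0:
        return "V"
    product = dx * dy  # has the same sign as dy / dx
    return "+" if product > 0 else "-" if product < 0 else "0"


def primitives(points):
    """Collapse runs of equal slope sign into one symbol per primitive."""
    out = []
    for p0, p1 in zip(points, points[1:]):
        sign = slope_sign(p0, p1)
        if not out or out[-1] != sign:
            out.append(sign)
    return "".join(out)


# Hypothetical toy grammar: a stroke that falls and then rises.
TOY_FSM = {("start", "-"): "falling", ("falling", "+"): "accept"}


def accepts(symbols, fsm=TOY_FSM, start="start", accept="accept"):
    """Trace the symbol string through the machine; reject on a missing transition."""
    state = start
    for symbol in symbols:
        state = fsm.get((state, symbol))
        if state is None:
            return False
    return state == accept
```

A recognizer in this style would hold one such machine per digit shape and report the digit whose machine accepts the primitive string, with a separate step for strings accepted by more than one machine.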
Abstract: In the national and professional music of the oral tradition
of many peoples in the East there is a metric formula called “ussuli”,
that is to say, rhythmic constructions of different character and
composition. Translated from Arabic, ussuli means “the law”. The
cultural contacts of the ancient and medieval inhabitants of
Central Asia, India, China, East Turkestan, Iraq, Afghanistan,
Turkey, and Iran have played a certain role in the formation of both the
musical and the dance heritage of each of these peoples. During
theatrical shows many dances were performed to the
accompaniment of percussion instruments such as the nagra, dayulpaz, and doll.
These instruments serve as obligatory accompanying
instruments in an orchestra and as solo instruments when supporting dance.
The dynamics of a dance composition's development, and at times the
execution of movement technique, depend on various
combinations of ussuli and the manner of their performance.
Abstract: In this paper we present an efficient system for
speaker-independent speech recognition based on a neural network
approach. The proposed architecture comprises two phases: a
preprocessing phase, which consists of segmental normalization and
feature extraction, and a classification phase, which uses neural
networks based on nonparametric density estimation, namely the
general regression neural network (GRNN). The relative
performance of the proposed model is compared to similar
recognition systems based on the Multilayer Perceptron (MLP), the
Recurrent Neural Network (RNN) and the well-known Discrete
Hidden Markov Model (HMM-VQ), which we have also implemented.
Experimental results obtained with Arabic digits have shown that the
use of nonparametric density estimation with an appropriate
smoothing factor (spread) improves the generalization power of the
neural network. The word error rate (WER) is reduced significantly
over the baseline HMM method. GRNN computation is a successful
alternative to the other neural networks and the DHMM.
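The core GRNN computation is a kernel-weighted average of the training targets, with the spread controlling how quickly a training example's influence decays with distance. The sketch below follows the standard GRNN formulation with a Gaussian kernel. The one-hot targets and the argmax decision are assumptions about how such a network would be used for digit classification.

```python
import math


def grnn_predict(x, train_x, train_y, spread=0.5):
    """GRNN output: Gaussian-weighted average of the training targets."""
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2 * spread ** 2))
               for xi in train_x]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("input too far from all training points for this spread")
    dims = len(train_y[0])
    return [sum(w * y[d] for w, y in zip(weights, train_y)) / total
            for d in range(dims)]


def grnn_classify(x, train_x, train_y, spread=0.5):
    """With one-hot targets (an assumption), the largest output gives the class index."""
    scores = grnn_predict(x, train_x, train_y, spread)
    return max(range(len(scores)), key=scores.__getitem__)
```

A small spread makes the network behave like a nearest-neighbour rule, while a large spread averages over many examples. This is the trade-off behind the abstract's remark that an appropriate smoothing factor improves generalization.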
Abstract: In this paper we propose a novel approach for
searching eCommerce products using a mobile phone, illustrated by a
prototype eCoMobile. This approach aims to globalize mobile
search by integrating the concept of user multilingualism into it. To
show this, we deal specifically with the English and Arabic languages.
The mobile user can formulate a query on a commercial
product in either language (English/Arabic). The description of the user's
information need on commercial products relies on the ontology that
represents the conceptualization of the product catalogue knowledge
domain defined in both English and Arabic languages. A query
expressed on a mobile device client defines the concept that
corresponds to the name of the product followed by a set of pairs
(property, value) specifying the characteristics of the product. Once a
query is submitted, it is communicated to the server side, which
analyses it and in turn performs an HTTP request to an eCommerce
application server (like Amazon). The latter responds by returning
an XML file representing a set of elements where each element
defines an item of the searched product with its specific
characteristics. The XML file is analyzed on the server side and then
the items are displayed on the mobile device client along with their
relevant characteristics in the chosen language.